---
title: "\\Large Risk of a feedback loop between climatic warming and human mobility"
author: 
- Nick Obradovich^[Corresponding author, nobradov@mit.edu. Media Lab, Massachusetts Institute of Technology.]
- Iyad Rahwan^[Media Lab and Institute for Data, Systems, and Society, Massachusetts Institute of Technology.]
date: ""
abstract: "Human behaviors alter -- and are altered by -- climate. Might the impacts of warming on human behaviors amplify anthropogenic contributions to climate change? Here we show that warmer temperatures substantially increase transportation use in the United States. To do so, we combine meteorological data with data on vehicle miles traveled and trips on public transit between 2002 and 2018. Moving from freezing temperatures up to 30°C increases vehicle miles traveled by over 10% and amplifies use of public transit by nearly 15%. Temperatures beyond 30°C exert little influence on either outcome. We then examine climate model projections to highlight the possible transportation impacts of future climatic changes. We project that warming over the coming century may add over one trillion cumulative vehicle miles traveled and six billion trips on public transit in the US alone, presenting the risk of a novel feedback loop in the human-environmental system."
output: 
  pdf_document:
    fig_caption: true
    keep_tex: true
bibliography: references.bib
fontsize: 11pt
header-includes:
    - \usepackage{caption}
    - \usepackage{bbm}
---

\captionsetup[table]{labelformat=empty}
\captionsetup[figure]{labelformat=empty}

```{r opts, echo=FALSE, message=FALSE, warning=FALSE, include=FALSE}
knitr::opts_chunk$set(fig.path='/', warning=F, message=F, fig.retina=NULL, cache=T, cache.lazy=F, autodep=T, echo=F, eval=T)
```

```{r packages}
# packages----
library(bit64) # long integers
library(ggthemes) #themes for plotting
library(sandwich) #robust se estimator
library(lmtest) #lm functions
library(ordinal) #ologit funcitons
library(memisc) #import stata data
library(ggmap) #make nice plots with ggplot mapping
library(ggplot2) #nice plots
library(Hmisc) #multipurpose package
library(RColorBrewer) #make nice colors
library(grid) #plotting package
library(MASS) #multipurpose package
library(gplots) #nice plotting
library(lfe) #run linear fixed effects models
library(stargazer) #output nice tables
library(lubridate) #deal with dates
library(zoo) #deal with dates
library(reshape2) #powerful reshape package
library(sp) #general spatial class package
library(gstat) #geostat package
library(spacetime) #store data in proper spatial/temporal class
library(raster) #convert to raster and vice versa
library(maptools) #plot and modify shapefiles
library(rgeos) #for use with maptools for shapefiles
library(rgdal) #for reading in shapefiles
library(RSAGA) #needed for spatial downscaling
library(RCurl) #also needed for spatial downscaling
library(dplyr) #for data frame manipulation functions
library(plyr) #for additional manipulation functions
library(data.table) #for data.table functions
library(foreign) #for reading in .dta files
library(parallel) #for parallel processing functions
library(foreach) #for plyr parallel
library(doMC) #run multicore processes
library(caTools) #functions for fast running mean and sd
library(ncdf4) #tools for netCDF packages
library(stringr) #string tools
library(knitr) #knitting
library(pander) #pandering
library(rasterVis) # plotting rasters
library(GGally) # for correlation plots
library(geosphere) # spherical distance
library(Rcpp) # for C++
library(RcppArmadillo) # for C++ (and Conley errors)
library(matrixStats) # matrix calculations
library(ggExtra) # for marginal histograms
library(gridExtra) # for multiplots
library(viridis) # for colors
library(Cairo) # pdf device
library(gtable) # for gtable filter
library(devtools)
#devtools::install_github("wilkelab/cowplot")
library(cowplot)
library(captioner)
library(splines)
library(survey) # svycontrast

# set cores for parallel processing----
registerDoMC(20)
```

```{r functions}
# functions for grabbing p-values and coefficients from summaries of felm models----
coefs <- function(reg, name) {
    if (round(reg$coefficients[rownames(reg$coefficients)==name, 1], 3) == 0) {
        format(round(reg$coefficients[rownames(reg$coefficients)==name, 1], 4), scientific=FALSE) 
    }
    else round(reg$coefficients[rownames(reg$coefficients)==name, 1], 3)
}
pval <- function(reg, name) {
  if (round(reg$coefficients[rownames(reg$coefficients)==name, 4], 3) >=0.001) 
  {round(reg$coefficients[rownames(reg$coefficients)==name, 4], 3)} 
  else paste("< 0.001")
}

# captioner setting----
fig_num <- captioner(prefix="Fig.")
tab_num <- captioner(prefix="Table")
eq_num <- captioner(prefix="Equation")
sfig_num <- captioner(prefix="Fig. S", auto_space=FALSE)
stab_num <- captioner(prefix="Table S", auto_space=FALSE)
seq_num <- captioner(prefix="Equation S", auto_space=FALSE)

# binned plot----
## convert felm result into a data frame with rhs | coef | se | xmin | xmax | xmid | ci.l | ci.h,
## replacing Inf and -Inf with in xmin and xmax on the edge bins with the graphically spaced bound
## felm.est: result from felm() call (lm will probably work too) with RHS factor variable like cuttmax or bins
## plotvar: string of variable to plot: e.g. 'cuttemp'
## breaks: bin size
## omit: omitted bin
bin.plot <- function(felm.est, plotvar, breaks, omit, panel="panel", ylabel="y-label", xlabel="x=label",
                     y.axis.title = element_text(size=25, angle=90, margin = margin(t = 0, r = 10, b = 0, l = 0), face="bold"),
                     ylimit = c(min(dt$ci.l) - (max(dt$ci.h) - min(dt$ci.l))/10,max(dt$ci.h)),
                     y.axis.text = element_text(size=13, angle=0, face="bold"),
                     y.axis.line = element_line(),
                     y.axis.ticks = element_line(),
                     errorline="gray60", errorfill="transparent", linecolor="#FB8C00", pointfill="#FB8C00", pointcolor="white") {

    dt <- as.data.frame(summary(felm.est)$coefficients[, 1:2]) # extract from felm summary (use df to keep coefficient names)
    dt <- dt[grepl(plotvar, rownames(dt)), ] # extract variable name (e.g., tmax.cut, ppt.cut, ...), limit to matches
    dt$rhs <- str_split_fixed(rownames(dt), "   ", n=2)[,1]
    dt <- as.data.table(dt)
    dt[, rhs := gsub(paste0(plotvar), "", rhs)]
    dt[, rhs := gsub("\\(|]", "", rhs)]
    dt[, rhs := gsub("neg", "-", rhs)]
    dt[, rhs := gsub("pos", "", rhs)]
    dt[grepl(",", rhs), rhs := tstrsplit(rhs, ',')[2]] # if a factor variable, grab second column (upper limit)
    dt[, rhs := as.numeric(rhs)]
    dt <- dt[!is.na(rhs)]
    dt[, xmin := rhs - breaks]
    dt[, xmax := rhs]
    dt[, rhs := NULL]
    setnames(dt, c("coef", "se", "xmin", "xmax"))
    dt[, xmin.lead := data.table::shift(xmin, type="lead")]
    dt[, xmax.lag := data.table::shift(xmax, type="lag")]
    dt[xmin==-Inf, xmin := xmin.lead - breaks]
    dt[xmax==-Inf, xmax := xmin.lead]
    dt[xmin==Inf, xmin := xmax.lag]
    dt[xmax==Inf, xmax := xmin + breaks]
    dt[, c('xmin.lead','xmax.lag') := NULL]
    # identify omitted category, add 0.
    dt <- rbind(dt, data.table(coef=0, se=0, xmin=omit[1], xmax=omit[2]))
    dt[, ':='(xmid=(xmin+xmax)/2, ci.l=coef - se*1.96, ci.h=coef + se*1.96)] # add some stuff for plotting
    dt[, range := paste(xmin, xmax, sep="-")] # create range variable for x labels
    dt[xmax == min(xmax), range := paste0("<=",min(xmax))] # set bottom of range
    dt[xmin == max(xmin), range := paste0(">",min(xmin))] # set top of range

    plot <- ggplot() +
        geom_hline(aes(yintercept=0), linetype=2, color='gray60') +
        geom_ribbon(data=dt, aes(x=xmid, ymin=ci.l, ymax=ci.h), fill=errorfill, alpha=0.1) +
        geom_line(data=dt, aes(x=xmid, y=ci.l), linetype="dotted", color=errorline, size=1.5, alpha=1) +
        geom_line(data=dt, aes(x=xmid, y=ci.h), linetype="dotted", color=errorline, size=1.5, alpha=1) +
        geom_line(data=dt, aes(x=xmid, y=coef), color=linecolor, size=1.75, alpha=1) +
        geom_point(data=dt, aes(x=xmid, y=coef), size=3, fill=pointfill, color=pointcolor, pch=21, alpha=1) +
        annotate("text", label=panel, x = min(dt$xmid) + (max(dt$xmid) - min(dt$xmid))/10, y = min(dt$ci.l) - (max(dt$ci.h) - min(dt$ci.l))/10, size=9) + # plot panel annotation
        ylab(ylabel) +
        xlab(xlabel) +
        coord_cartesian(xlim=c(min(dt$xmid), max(dt$xmid)), ylim=ylimit, expand=TRUE) +
        scale_x_continuous(breaks = dt$xmid, labels = dt$range) +
        theme(plot.title = element_text(face="bold", size=40),
              axis.title.x = element_text(size=20, angle=0, face="bold"),
              axis.title.y = y.axis.title,
              axis.text.y = y.axis.text,
              axis.text.x = element_text(size=13, face="bold"),
              panel.grid.major = element_blank(),
              panel.grid.minor = element_blank(),
              panel.background = element_blank(),
              axis.line.x = element_line(),
              axis.line.y = y.axis.line,
              axis.ticks.y = y.axis.ticks,
              legend.title=element_blank()
        )
    return(plot)
}

# spline functions----
## vmt
vmt_estimate_spline <- function(data, dv, iv, lhs) {
  ## keep spline knots with the fitted model results for plotting
  spl_knots <- wtd.quantile(iv[!is.na(dv)], weights=dv[!is.na(dv)], probs = c(0.25, 0.5, 0.75), na.rm=T) # get quantiles of iv weighted by dv (cannot have missing values)
  spl_boundary_knots <- range(iv, na.rm=T) # get range of independent variable
  xs <- as.data.table(ns(iv, knots = spl_knots, Boundary.knots = spl_boundary_knots)) # get cubic splines of iv
  setnames(xs, paste0("ns_", 1:ncol(xs))) # set names of splines
  data <- cbind(data, xs)
  ## set up felm model
  rhs <- paste(c(names(xs), "trange", "prcp.days", "cloud", "humid", "wind"), collapse = " + ")
  fe <- paste(c("state:month", "state:year"), collapse = " + ")
  fmla <- as.formula(sprintf("%s ~ %s | %s | 0 | state", lhs, rhs, fe))
  fit_felm <- felm(fmla, data=data)
  pred_tmax <- seq(min(iv, na.rm=T), max(iv, na.rm=T), length.out = 200) # prediction iv
  xp <- as.data.table(ns(pred_tmax, knots = spl_knots, Boundary.knots = spl_boundary_knots)) 
  result <- foreach::foreach (i = 1:nrow(xp), .combine = rbind) %do% {
    this.x <- pred_tmax[i]
    this.row <- c(unlist(xp[i, ]), 0, 0, 0, 0, 0) # include zeros for meteorological controls
    temp.dt <- as.data.table(svycontrast(fit_felm, this.row))
    temp.dt[, x := this.x]
    setnames(temp.dt, c("coef", "se", "x"))
    return(temp.dt)
  }
  return(result)
} # modified function from patrick baylis

## upt
upt_estimate_spline <- function(data, dv, iv, lhs) {
  ## keep spline knots with the fitted model results for plotting
  spl_knots <- wtd.quantile(iv[!is.na(dv)], weights=dv[!is.na(dv)], probs = c(0.25, 0.5, 0.75), na.rm=T) # get quantiles of iv weighted by dv (cannot have missing values)
  spl_boundary_knots <- range(iv, na.rm=T) # get range of independent variable
  xs <- as.data.table(ns(iv, knots = spl_knots, Boundary.knots = spl_boundary_knots)) # get cubic splines of iv
  setnames(xs, paste0("ns_", 1:ncol(xs))) # set names of splines
  data <- cbind(data, xs)
  ## set up felm model
  rhs <- paste(c(names(xs), "trange", "prcp.days", "cloud", "humid", "wind"), collapse = " + ")
  fe <- paste(c("uza:month", "uza:year"), collapse = " + ")
  fmla <- as.formula(sprintf("%s ~ %s | %s | 0 | uza", lhs, rhs, fe))
  fit_felm <- felm(fmla, data=data)
  pred_tmax <- seq(min(iv, na.rm=T), max(iv, na.rm=T), length.out = 200) # prediction iv
  xp <- as.data.table(ns(pred_tmax, knots = spl_knots, Boundary.knots = spl_boundary_knots)) 
  result <- foreach::foreach (i = 1:nrow(xp), .combine = rbind) %do% {
    this.x <- pred_tmax[i]
    this.row <- c(unlist(xp[i, ]), 0, 0, 0, 0, 0) # include zeros for meteorological controls
    temp.dt <- as.data.table(svycontrast(fit_felm, this.row))
    temp.dt[, x := this.x]
    setnames(temp.dt, c("coef", "se", "x"))
    return(temp.dt)
  }
  return(result)
} # modified function from patrick baylis
```

```{r data}
load('./replication.RData')
```

```{r models}
# vehicle miles traveled main bootstrap----
# boot.results <- foreach::foreach (b = 1:1000, .combine = rbind) %dopar% {
#                     bstrap_sample <- vmt[.(sample(unique(state), replace = T))] # block bootstrap sample on state
#                     a <- vmt_estimate_spline(data=bstrap_sample, dv=bstrap_sample$log_vmt, iv=bstrap_sample$tmax, lhs="log_vmt")[, b := b] # estimate spline
#                     return(a)
#                   }
# main.result <- vmt_estimate_spline(data=vmt, dv=vmt$log_vmt, iv=vmt$tmax, lhs="log_vmt")[, b:= 0] # main
# boot.results <- rbind(boot.results, main.result) # combine main and bootstrapped results
# boot.results[, bmax := max(coef, na.rm=T), by = b] # get max of that bootstrap's prediction
# boot.results[, coef_norm := coef - bmax] # normalize so max=0
# saveRDS(boot.results, "./vmt/main_bootstrap.rds")
# rm(main.result, boot.results)
vmt.main <- readRDS("./vmt/main_bootstrap.rds")

# public transit trips main bootstrap----
# boot.results <- foreach::foreach (b = 1:1000, .combine = rbind) %dopar% {
#                     k <- 0
#                     while(k!=1000){
#                           bstrap_sample = try(upt[.(sample(unique(uza), replace = T))])
#                           if (class(bstrap_sample)=="try-error") {
#                              k <- k+1
#                              print(k)
#                              } else break
#                           } # block bootstrap sample on city, don't oversample (code handles errors)
#                     a <- upt_estimate_spline(data=bstrap_sample, dv=bstrap_sample$log_trips, iv=bstrap_sample$tmax, lhs="log_trips")[, b := b] # estimate spline
#                     return(a)
#                   }
# main.result <- upt_estimate_spline(data=upt, dv=upt$log_trips, iv=upt$tmax, lhs="log_trips")[, b:= 0] # main
# boot.results <- rbind(boot.results, main.result) # combine main and bootstrapped results
# boot.results[, bmax := max(coef, na.rm=T), by = b] # get max of that bootstrap's prediction
# boot.results[, coef_norm := coef - bmax] # normalize so max=0
# saveRDS(boot.results, "./upt/main_bootstrap.rds")
# rm(main.result, boot.results)
upt.main <- readRDS("./upt/main_bootstrap.rds")
```

```{r fig1, eval=FALSE}
# chloropleth of vehicle miles and public transit----
## vmt
names(states)[6] <- "state" # fix names
states$state <- as.character(states$state)
vmt.state <- vmt[, .(log_vmt = log(sum(vmt*1000000, na.rm=T))), by=.(state)]
vmt.state[, state := as.character(state)]
vmt.state <- sp::merge(states, vmt.state, by="state")

p1 <- spplot(
       obj=vmt.state,
       zcol=c("log_vmt"),
       strip=FALSE,
       col.regions=colorRampPalette(c("#d5ed76","#FB8C00"))(100), # set colors
       as.table = TRUE,
       col=NA,
       xlab=NULL, # kill x-axis
       ylab=NULL, # kill y-axis
       scales=list(draw=FALSE),
       margin=FALSE, # remove margin padding
       colorkey=list(space="bottom", labels=list(cex=1.2), height=0.5), # make legend on bottom of plot
       at=seq(min(vmt.state$log_vmt, na.rm=TRUE), max(vmt.state$log_vmt, na.rm=TRUE)+0.000001, length.out=100),
       par.settings = plasmaTheme(),
       sub=list("log(Total Vehicle Miles Traveled)", cex=1.4)
       )
p1 <- update(p1, par.settings = list(axis.line = list(col = "transparent"),
                                   par.main.text=list(cex=2),
                                   panel.background=list(col="transparent")))
p1 <- p1 + layer_(sp.polygons(obj=us,
                              fill="white",
                              col="black"))
p1 <- p1 + layer(sp.lines(obj=roads,
                              fill="white",
                              col="black",
                              cex=0.01,
                              alpha=0.2))
p1 <- p1 + layer(sp.lines(obj=states,
                              fill="transparent",
                              col="black",
                              cex=0.1))

## public transit
points <- upt[, .(lat = unique(lat), lon = unique(lon)), by=.(uza)]
coordinates(points) <- ~lon + lat
projection(points) <- projection(states)
uza.state <- cbind(as.data.table(points), as.data.table(over(points, states)))
uza.state <- uza.state[, .(uza, state)]
setkey(uza.state, uza)
setkey(upt, uza)
upt.state <- merge(upt, uza.state, all.x=T)
upt.state <- upt.state[, .(log_trips = log(sum(trips, na.rm=T))), by=.(state)]
upt.state[is.infinite(log_trips), log_trips := 0]
upt.state[, state := as.character(state)]
upt.state <- sp::merge(states, upt.state, by="state")

p2 <- spplot(
       obj=upt.state,
       zcol=c("log_trips"),
       strip=FALSE,
       col.regions=colorRampPalette(c("#d5ed76","#19A1FC"))(100), # set colors
       as.table = TRUE,
       col=NA,
       xlab=NULL, # kill x-axis
       ylab=NULL, # kill y-axis
       scales=list(draw=FALSE),
       margin=FALSE, # remove margin padding
       colorkey=list(space="bottom", labels=list(cex=1.2), height=0.5), # make legend on bottom of plot
       at=seq(min(upt.state$log_trips, na.rm=TRUE), max(upt.state$log_trips, na.rm=TRUE)+0.000001, length.out=100),
       par.settings = plasmaTheme(),
       sub=list("log(Total Public Transit Trips)", cex=1.4)
       )
p2 <- update(p2, par.settings = list(axis.line = list(col = "transparent"),
                                   par.main.text=list(cex=2),
                                   panel.background=list(col="transparent")))
p2 <- p2 + layer_(sp.polygons(obj=states,
                              fill="white",
                              col="black"))
p2 <- p2 + layer(sp.lines(obj=roads,
                              fill="white",
                              col="black",
                              cex=0.01,
                              alpha=0.2))
p2 <- p2 + layer(sp.lines(obj=states,
                              fill="transparent",
                              col="black",
                              cex=0.1))

# vehicle miles and public transit trips time series----
vmttime <- vmt[, .(tot_vmt = sum(vmt*1000000, na.rm=T)), by=.(monthyr)]
vmttime.plot <- ggplot(vmttime, aes(x=monthyr, y=tot_vmt)) +
                    ylab("US Vehicle Miles Traveled") +
                    xlab("Calendar Month") +
                    geom_line(color='gray80', size=1.5) + 
                    geom_area(fill="#FB8C00", alpha=0.8) +
                    scale_x_continuous(breaks = c(2004,2006,2008,2010,2012,2014,2016,2018)) +
                    scale_y_continuous(limits = c(0e+00, 3e+11), labels = function(x) formatC(x, format = "e", digits = 0)) +
                    theme(plot.title = element_text(face="bold", size=40),
                          axis.title.x = element_text(size=20, face="bold"),
                          axis.title.y = element_text(size=20, face="bold"),
                          axis.text.y = element_text(size=15, face="bold"),
                          axis.text.x = element_text(size=15, face="bold"),
                          axis.ticks.y = element_line(),
                          panel.grid.major = element_blank(),
                          panel.grid.minor = element_blank(),
                          panel.background = element_blank(),
                          axis.line.x = element_line(),
                          axis.line.y = element_line()
                    )

upttime <- upt[, .(tot_upt = sum(trips, na.rm=T)), by=.(monthyr)]
upttime.plot <- ggplot(upttime, aes(x=monthyr, y=tot_upt)) +
                    ylab("US Public Transit Trips") +
                    xlab("Calendar Month") +
                    geom_line(color='gray80', size=1.5) + 
                    geom_area(fill="#19A1FC", alpha=0.8) +
                    scale_x_continuous(breaks = c(2004,2006,2008,2010,2012,2014,2016,2018)) +
                    scale_y_continuous(labels = function(x) formatC(x, format = "e", digits = 0)) +
                    theme(plot.title = element_text(face="bold", size=40),
                          axis.title.x = element_text(size=20, face="bold"),
                          axis.title.y = element_text(size=20, face="bold"),
                          axis.text.y = element_text(size=15, face="bold"),
                          axis.text.x = element_text(size=15, face="bold"),
                          axis.ticks.y = element_line(),
                          panel.grid.major = element_blank(),
                          panel.grid.minor = element_blank(),
                          panel.background = element_blank(),
                          axis.line.x = element_line(),
                          axis.line.y = element_line()
                    )

# plot----
cairo_pdf(filename = './figure1.pdf', width=18, height=10)
grid.arrange( 
             arrangeGrob(p1, p2, ncol=2),
             arrangeGrob(rectGrob(gp=gpar(col="white")), vmttime.plot, rectGrob(gp=gpar(col="white")), upttime.plot, rectGrob(gp=gpar(col="white")), ncol=5, widths=c(0.04, 0.35,0.15,0.35, 0.11)),
             nrow=2, ncol=1, heights=c(0.6, 0.4))
grid.text("a", x=unit(0.0145, "npc"), y=unit(0.975, "npc"), gp=gpar(fontsize=40,font=2))
grid.text("b", x=unit(0.5145, "npc"), y=unit(0.975, "npc"), gp=gpar(fontsize=40,font=2))
grid.text("c", x=unit(0.125, "npc"), y=unit(0.11, "npc"), gp=gpar(fontsize=40,font=2))
grid.text("d", x=unit(0.6275, "npc"), y=unit(0.11, "npc"), gp=gpar(fontsize=40,font=2))
dev.off()
```

```{r fig2, eval=FALSE}
# vehicle miles traveled----
vmt.main.plot <- ggplot(vmt.main[b!=0], aes(x=x, y=coef_norm, group=b)) + 
                  geom_hline(aes(yintercept=0), linetype=2, color='gray80') +
                  geom_line(color="#ffc072", alpha=0.03, size=0.25) + 
                  geom_line(data=vmt.main[b==0,], aes(x=x, y=coef_norm), color="#FB8C00", size=4, alpha=1) +
                  coord_cartesian(ylim=c(-0.32, 0.01)) +
                  scale_x_continuous(breaks = seq(-10,40,10)) +
                  ylab("Log(Vehicle Miles Traveled)") +
                  xlab("Monthly Average Maximum Temperature in \u2103") +
                  theme(plot.title = element_text(face="bold", size=40),
                        axis.title.x = element_text(size=20, face="bold"),
                        axis.title.y = element_text(size=20, face="bold"),
                        axis.text.y = element_text(size=12, face="bold"),
                        axis.text.x = element_text(size=12, face="bold"),
                        panel.grid.major = element_blank(),
                        panel.grid.minor = element_blank(),
                        panel.background = element_blank(),
                        axis.line.x = element_line(),
                        axis.line.y = element_line(),
                        axis.ticks.y = element_line(),
                        legend.title=element_blank(),
                        legend.position = 'none'
                  )
hist <- axis_canvas(vmt.main.plot, axis = "x") + 
        geom_histogram(data = vmt, aes(x = tmax), bins=30, fill="#FB8C00", color='#d5ed76', alpha=0.5, size=0.1)
vmt.main.plot <- insert_xaxis_grob(vmt.main.plot, hist, position = "bottom", height = grid::unit(0.075, "null"))
vmt.main.plot <- ggdraw(vmt.main.plot)

# vmt deconvolved data----
vmt[, statemonth := as.factor(paste0(state, month))]
vmt[, stateyear := as.factor(paste0(state, year))]

de.vmt <- as.data.table(demeanlist(mtx = list(vmt$tmax, vmt$log_vmt), fl = list(vmt$statemonth, vmt$stateyear)))
setnames(de.vmt, c('tmax.demean.all','vmt.demean.all'))
demeaned.vmt <- cbind(vmt, de.vmt)
de.vmt <- as.data.table(demeanlist(mtx = list(vmt$tmax, vmt$log_vmt), fl = list(vmt$state)))
setnames(de.vmt, c('tmax.demean.unitonly','vmt.demean.unitonly'))
demeaned.vmt <- cbind(demeaned.vmt, de.vmt)

# vmt deconvolved log_vmt----
p1 <- ggplot(demeaned.vmt[statecode=="NM",], aes(x=monthyr)) + 
    ylab("Log(VMT)\nAnomaly") +
    xlab("Calendar Month") +
    geom_hline(yintercept = 0, linetype="dashed") +
    geom_line(aes(y=vmt.demean.unitonly, color='#d5ed76'), size=0.25, alpha=1) +
    geom_line(aes(y=vmt.demean.all, color="#FB8C00"), size=0.75, alpha=1) +
    scale_color_manual(limits=c("#d5ed76","#FB8C00"), values=c("#d5ed76","#FB8C00"), labels=c("State FE", "State-by-Month & State-by-Year FE")) +
    guides(color = guide_legend(override.aes = list(size=5))) +
    coord_cartesian(ylim=c(-0.35, 0.2), expand=TRUE) +
    scale_x_continuous(breaks = c(2004,2008,2012,2016)) +
    scale_y_continuous(breaks = c(-0.2, -0.1, 0, 0.1), labels = function(x) formatC(x, format = "f", digits = 2)) +
    theme(plot.title = element_text(face="bold", size=40),
          axis.title.x = element_text(size=15, angle=0, face="bold"),
          axis.title.y = element_text(size=15, angle=90, face="bold"),
          axis.text.y = element_text(size=12, face="bold"),
          axis.text.x = element_text(size=12, face="bold"),
          axis.ticks.x = element_line(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank(),
          axis.line.x = element_line(),
          axis.line.y = element_line(),
          legend.title=element_blank(),
          legend.position = c(0.025,0.15),
          legend.text=element_text(size=9, face="bold"),
          legend.key = element_rect(colour = "black")
    )

# vmt deconvolved tmax----
p2 <- ggplot(demeaned.vmt[statecode=="NM",], aes(x=monthyr)) + 
    ylab("Max. Temp.\nAnomaly") +
    xlab("Calendar Month") +
    geom_hline(yintercept = 0, linetype="dashed") +
    geom_line(aes(y=tmax.demean.unitonly, color='#d5ed76'), size=0.25, alpha=1) +
    geom_line(aes(y=tmax.demean.all, color="#FB8C00"), size=0.75, alpha=1) +
    scale_color_manual(limits=c("#d5ed76","#FB8C00"), values=c("#d5ed76","#FB8C00"), labels=c("State FE", "State-by-Month & State-by-Year FE")) +
    guides(color = guide_legend(override.aes = list(size=5))) +
    coord_cartesian(ylim=c(-35, 20), expand=TRUE) +
    scale_x_continuous(breaks = c(2004,2008,2012,2016)) +
    scale_y_continuous(breaks = c(-10, 0, 10), labels = function(x) formatC(x, format = "f", digits = 1)) +
    theme(plot.title = element_text(face="bold", size=40),
          axis.title.x = element_text(size=15, angle=0, face="bold"),
          axis.title.y = element_text(size=15, angle=90, face="bold"),
          axis.text.y = element_text(size=12, face="bold"),
          axis.text.x = element_text(size=12, face="bold"),
          axis.ticks.x = element_line(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank(),
          axis.line.x = element_line(),
          axis.line.y = element_line(),
          legend.title=element_blank(),
          legend.position = "none",
          legend.text=element_text(size=9, face="bold"),
          legend.key = element_rect(colour = "black")
    )

# vmt demonstrate what's going on under the hood----
## get state-month means
demeaned.vmt[, statemonth_mean_tmax := mean(tmax, na.rm=T), by=.(statemonth)]
demeaned.vmt[, cut_mean_tmax := cut(statemonth_mean_tmax, breaks=c(-Inf,seq(0, 30, by=5),Inf))] # create factor variable

a <- foreach (i = c('(-Inf,0]', '(0,5]', '(5,10]', '(10,15]', '(15,20]', '(20,25]', '(25,30]', '(30, Inf]')) %do% {
return(
ggplot(data=demeaned.vmt[cut_mean_tmax==i]) +
  ggtitle(label=if (i=="(-Inf,0]") {
    "<0\u2103"
  } else if (i=="(30, Inf]") {
      ">30\u2103"
  } else {
      gsub("]", "\u2103", gsub(",", "-", gsub("\\(", "", i)))
    }
          ) +
  geom_point(aes(x=tmax.demean.all, y=vmt.demean.all), alpha=0.05, color="gray60", size=0.05) +
  geom_smooth(aes(x=tmax.demean.all, y=vmt.demean.all), method="lm", fill=NA, color="#FB8C00", size=1.5) +
  coord_cartesian(ylim=c(-0.1, 0.1), xlim=c(-5, 5), expand=TRUE) +
  scale_x_continuous(breaks=c(-5,0,5)) +
  theme(plot.title = element_text(face="bold", size=11, hjust=0.5),
          axis.title.x = element_blank(),
          axis.title.y = element_blank(),
          axis.text.y = element_blank(),
          axis.text.x = element_text(size=12, face="bold"),
          axis.ticks.y = element_blank(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank(),
          axis.line.x = element_line(),
          axis.line.y = element_blank(),
          legend.title=element_blank(),
          legend.position = 'none',
          legend.text=element_text(size=15, face="bold"),
          legend.key = element_rect(colour = "black")
    )
)
}

justy.scatter.vmt <- ggplot(data=demeaned.vmt[cut_mean_tmax=='(0,5]']) +
  ylab("Log(VMT)\nAnomaly") +
  ggtitle('(0,5]') +
  geom_smooth(aes(x=tmax.demean.all, y=vmt.demean.all), method="lm", fill=NA, color="chocolate1") +
  geom_point(aes(x=tmax.demean.all, y=vmt.demean.all), alpha=0.05, color="gray60", size=0.5) +
  coord_cartesian(ylim=c(-0.1, 0.1), xlim=c(-5,5), expand=TRUE) +
  scale_y_continuous(breaks = c(-0.1, 0, 0.1)) +
  theme(plot.title = element_text(face="bold", size=25, hjust=0.5),
          axis.title.x = element_blank(),
          axis.title.y = element_text(size=15, face="bold"),
          axis.text.y = element_text(size=12, face="bold"),
          axis.text.x = element_text(size=12, face="bold"),
          axis.ticks.y = element_line(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank(),
          axis.line.x = element_line(),
          axis.line.y = element_line(),
          legend.title=element_blank(),
          legend.position = 'none',
          legend.text=element_text(size=15, face="bold"),
          legend.key = element_rect(colour = "black")
    )
justy.scatter.vmt <- gtable_filter(ggplotGrob(justy.scatter.vmt), 'axis-l|ylab', trim=F)
justy.scatter.vmt <- ggdraw(justy.scatter.vmt)

# public transit trips----
upt.main.plot <- ggplot(upt.main[b!=0], aes(x=x, y=coef_norm, group=b)) + 
                  geom_hline(aes(yintercept=0), linetype=2, color='gray80') +
                  geom_line(color="#bae3ff", alpha=0.03, size=0.25) + 
                  geom_line(data=upt.main[b==0,], aes(x=x, y=coef_norm), color="#19A1FC", size=4, alpha=1) +
                  coord_cartesian(ylim=c(-0.32, 0.01)) +
                  scale_x_continuous(breaks = seq(-10,40,10)) +
                  ylab("Log(Public Transit Trips)") +
                  xlab("Monthly Average Maximum Temperature in \u2103") +
                  theme(plot.title = element_text(face="bold", size=40),
                        axis.title.x = element_text(size=20, face="bold"),
                        axis.title.y = element_text(size=20, face="bold"),
                        axis.text.y = element_text(size=12, face="bold"),
                        axis.text.x = element_text(size=12, face="bold"),
                        panel.grid.major = element_blank(),
                        panel.grid.minor = element_blank(),
                        panel.background = element_blank(),
                        axis.line.x = element_line(),
                        axis.line.y = element_line(),
                        axis.ticks.y = element_line(),
                        legend.title=element_blank(),
                        legend.position = 'none'
                  )
hist <- axis_canvas(upt.main.plot, axis = "x") + 
        geom_histogram(data = upt, aes(x = tmax), bins=30, fill="#19A1FC", color='#d5ed76', alpha=0.5, size=0.1)
upt.main.plot <- insert_xaxis_grob(upt.main.plot, hist, position = "bottom", height = grid::unit(0.075, "null"))
upt.main.plot <- ggdraw(upt.main.plot)

# upt deconvolved data----
upt[, uzamonth := as.factor(paste0(uza, month))]
upt[, uzayear := as.factor(paste0(uza, year))]
upt[, numyear := uniqueN(year), by=.(uza)]

de.upt <- as.data.table(demeanlist(mtx = list(upt$tmax, upt$log_trips), fl = list(upt$uzamonth, upt$uzayear)))
setnames(de.upt, c('tmax.demean.all','upt.demean.all'))
demeaned.upt <- cbind(upt, de.upt)
de.upt <- as.data.table(demeanlist(mtx = list(upt$tmax, upt$log_trips), fl = list(upt$uza)))
setnames(de.upt, c('tmax.demean.unitonly','upt.demean.unitonly'))
demeaned.upt <- cbind(demeaned.upt, de.upt)

# upt deconvolved log_trips----
p3 <- ggplot(demeaned.upt[uza==1], aes(x=monthyr)) + 
    ylab("Log(PTT)\nAnomaly") +
    xlab("Calendar Month") +
    geom_hline(yintercept = 0, linetype="dashed") +
    geom_line(aes(y=upt.demean.unitonly, color='#d5ed76'), size=0.25, alpha=1) +
    geom_line(aes(y=upt.demean.all, color="#19A1FC"), size=0.75, alpha=1) +
    scale_color_manual(limits=c("#d5ed76","#19A1FC"), values=c("#d5ed76","#19A1FC"), labels=c("City FE", "City-by-Month & City-by-Year FE")) +
    guides(color = guide_legend(override.aes = list(size=5))) +
    coord_cartesian(ylim=c(-0.35, 0.20), expand=TRUE) +
    scale_x_continuous(breaks = c(2004,2008,2012,2016)) +
    scale_y_continuous(breaks = c(-0.2, -0.1, 0, 0.1), labels = function(x) formatC(x, format = "f", digits = 2)) +
    theme(plot.title = element_text(face="bold", size=40),
          axis.title.x = element_text(size=15, angle=0, face="bold"),
          axis.title.y = element_text(size=15, angle=90, face="bold"),
          axis.text.y = element_text(size=12, face="bold"),
          axis.text.x = element_text(size=12, face="bold"),
          axis.ticks.x = element_line(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank(),
          axis.line.x = element_line(),
          axis.line.y = element_line(),
          legend.title=element_blank(),
          legend.position = c(0.025,0.15),
          legend.text=element_text(size=9, face="bold"),
          legend.key = element_rect(colour = "black")
    )

# upt deconvolved tmax----
p4 <- ggplot(demeaned.upt[uza==1,], aes(x=monthyr)) + 
    ylab("Max. Temp.\nAnomaly") +
    xlab("Calendar Month") +
    geom_hline(yintercept = 0, linetype="dashed") +
    geom_line(aes(y=tmax.demean.unitonly, color='#d5ed76'), size=0.25, alpha=1) +
    geom_line(aes(y=tmax.demean.all, color="#19A1FC"), size=0.75, alpha=1) +
    scale_color_manual(limits=c("#d5ed76","#19A1FC"), values=c("#d5ed76","#19A1FC"), labels=c("City FE", "City-by-Month & City-by-Year FE")) +
    guides(color = guide_legend(override.aes = list(size=5))) +
    coord_cartesian(ylim=c(-35, 20), expand=TRUE) +
    scale_x_continuous(breaks = c(2004,2008,2012,2016)) +
    scale_y_continuous(breaks = c(-10, 0, 10), labels = function(x) formatC(x, format = "f", digits = 1)) +
    theme(plot.title = element_text(face="bold", size=40),
          axis.title.x = element_text(size=15, angle=0, face="bold"),
          axis.title.y = element_text(size=15, angle=90, face="bold"),
          axis.text.y = element_text(size=12, face="bold"),
          axis.text.x = element_text(size=12, face="bold"),
          axis.ticks.x = element_line(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank(),
          axis.line.x = element_line(),
          axis.line.y = element_line(),
          legend.title=element_blank(),
          legend.position = "none",
          legend.text=element_text(size=9, face="bold"),
          legend.key = element_rect(colour = "black")
    )

# upt demonstrate what's going on under the hood----
## get state-month means
demeaned.upt[, uzamonth_mean_tmax := mean(tmax, na.rm=T), by=.(uzamonth)]
demeaned.upt[, cut_mean_tmax := cut(uzamonth_mean_tmax, breaks=c(-Inf,seq(0, 30, by=5),Inf))] # create factor variable

b <- foreach (i = c('(-Inf,0]', '(0,5]', '(5,10]', '(10,15]', '(15,20]', '(20,25]', '(25,30]', '(30, Inf]')) %do% {
return(
ggplot(data=demeaned.upt[cut_mean_tmax==i]) +
  ggtitle(label=if (i=="(-Inf,0]") {
    "<0\u2103"
  } else if (i=="(30, Inf]") {
      ">30\u2103"
  } else {
      gsub("]", "\u2103", gsub(",", "-", gsub("\\(", "", i)))
    }
          ) +
  geom_point(aes(x=tmax.demean.all, y=upt.demean.all), alpha=0.01, color="gray60", size=0.05) +
  geom_smooth(aes(x=tmax.demean.all, y=upt.demean.all), method="lm", fill=NA, color="#19A1FC", size=1.5) +
  coord_cartesian(ylim=c(-0.1, 0.1), xlim=c(-5, 5), expand=TRUE) +
  scale_y_continuous(breaks=c(-0.1,0,0.1)) +
  scale_x_continuous(breaks=c(-5,0,5)) +
  theme(plot.title = element_text(face="bold", size=11, hjust=0.5),
          axis.title.x = element_blank(),
          axis.title.y = element_blank(),
          axis.text.y = element_blank(),
          axis.text.x = element_text(size=12, face="bold"),
          axis.ticks.y = element_blank(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank(),
          axis.line.x = element_line(),
          axis.line.y = element_blank(),
          legend.title=element_blank(),
          legend.position = 'none',
          legend.text=element_text(size=15, face="bold"),
          legend.key = element_rect(colour = "black")
    )
)
}

justy.scatter.upt <- ggplot(data=demeaned.upt[cut_mean_tmax=='(0,5]']) +
  ylab("Log(PTT)\nAnomaly") +
  ggtitle('(0,5]') +
  geom_smooth(aes(x=tmax.demean.all, y=upt.demean.all), method="lm", fill=NA, color="chocolate1") +
  geom_point(aes(x=tmax.demean.all, y=upt.demean.all), alpha=0.01, color="gray60", size=0.5) +
  coord_cartesian(ylim=c(-0.1, 0.1), xlim=c(-5,5), expand=TRUE) +
  scale_y_continuous(breaks = c(-0.1, 0, 0.1)) +
  theme(plot.title = element_text(face="bold", size=25, hjust=0.5),
          axis.title.x = element_blank(),
          axis.title.y = element_text(size=15, face="bold"),
          axis.text.y = element_text(size=12, face="bold"),
          axis.text.x = element_text(size=12, face="bold"),
          axis.ticks.y = element_line(),
          panel.grid.major = element_blank(),
          panel.grid.minor = element_blank(),
          panel.background = element_blank(),
          axis.line.x = element_line(),
          axis.line.y = element_line(),
          legend.title=element_blank(),
          legend.position = 'none',
          legend.text=element_text(size=15, face="bold"),
          legend.key = element_rect(colour = "black")
    )
justy.scatter.upt <- gtable_filter(ggplotGrob(justy.scatter.upt), 'axis-l|ylab', trim=F)
justy.scatter.upt <- ggdraw(justy.scatter.upt)

# plot----
cairo_pdf(file = './figure2.pdf', width=14, height=10)
grid.arrange(
  arrangeGrob(vmt.main.plot,
             arrangeGrob(arrangeGrob(rectGrob(gp=gpar(col="white")), justy.scatter.vmt, arrangeGrob(a[[1]], a[[2]], a[[3]], a[[4]], a[[5]], a[[6]], a[[7]], a[[8]], ncol=8), rectGrob(gp=gpar(col="white")), ncol=4, widths=c(0.1, 0.1, 0.7, 0.1)), textGrob("Max. Temp. Anomaly in \u2103", hjust=0.35, gp=gpar(fontsize=15, font=2)), nrow=2, heights=c(0.93, 0.07)), 
             arrangeGrob(p1, p2, ncol=2),
             nrow=3, 
             heights=c(0.5,0.25, 0.25)),
  arrangeGrob(upt.main.plot,
             arrangeGrob(arrangeGrob(rectGrob(gp=gpar(col="white")), justy.scatter.upt, arrangeGrob(b[[1]], b[[2]], b[[3]], b[[4]], b[[5]], b[[6]], b[[7]], b[[8]], ncol=8), rectGrob(gp=gpar(col="white")), ncol=4, widths=c(0.1, 0.1, 0.7, 0.1)), textGrob("Max. Temp. Anomaly in \u2103", hjust=0.35, gp=gpar(fontsize=15, font=2)), nrow=2, heights=c(0.93, 0.07)), 
             arrangeGrob(p3, p4, ncol=2),
             nrow=3, 
             heights=c(0.5,0.25, 0.25)),
  ncol=2
  )
grid.text("a", x=unit(0.07, "npc"), y=unit(0.98, "npc"), gp=gpar(fontsize=30,font=2))
grid.text("b", x=unit(0.57, "npc"), y=unit(0.98, "npc"), gp=gpar(fontsize=30,font=2))
grid.text("c", x=unit(0.12, "npc"), y=unit(0.32, "npc"), gp=gpar(fontsize=30,font=2))
grid.text("d", x=unit(0.62, "npc"), y=unit(0.32, "npc"), gp=gpar(fontsize=30,font=2))
grid.text("e", x=unit(0.09, "npc"), y=unit(0.225, "npc"), gp=gpar(fontsize=30,font=2))
grid.text("f", x=unit(0.34, "npc"), y=unit(0.225, "npc"), gp=gpar(fontsize=30,font=2))
grid.text("g", x=unit(0.59, "npc"), y=unit(0.225, "npc"), gp=gpar(fontsize=30,font=2))
grid.text("h", x=unit(0.84, "npc"), y=unit(0.225, "npc"), gp=gpar(fontsize=30,font=2))
dev.off()
```

```{r fig3, eval=FALSE}
# plot historical tmax histogram with indication of peak vmt----
dum <- data.frame(point = sort(rep(letters[1], 10)), x = runif(10), y = runif(10))
tmax.hist <- ggplot(data = vmt, aes(x = tmax)) +
                  geom_histogram(aes(y=..density..), fill='#d5ed76', color='gray40', alpha=0.8, bins=30) +
                  ylab("Density") + 
                  xlab("Max. Temp. in \u2103") +
                  geom_vline(xintercept = vmt.main[b==0 & coef_norm==max(coef_norm), x], color='#FB8C00', linetype=1, size=3) +
                  geom_line(data = dum, aes(color = point, x = x, y = y), alpha = 0, size=4) + 
                  scale_color_manual(guide = guide_legend(override.aes = list(alpha = 1)),
                     limits=c("#FB8C00"), 
                     values=c("#FB8C00"),
                     labels=c("Peak\nHistorical\nVMT")) +
                  coord_cartesian(ylim=c(0, 0.04), xlim=c(-12,40)) +
                  theme(plot.title = element_text(face="bold", size=40),
                        axis.title.x = element_text(size=20, face="bold"),
                        axis.title.y = element_text(size=20, face="bold"),
                        axis.text.y = element_text(size=12, face="bold"),
                        axis.text.x = element_text(size=12, face="bold"),
                        panel.grid.major = element_blank(),
                        panel.grid.minor = element_blank(),
                        panel.background = element_blank(),
                        axis.line.x = element_line(),
                        axis.line.y = element_line(),
                        axis.ticks.y = element_line(),
                        legend.title=element_blank(),
                        legend.position = c(.01, 0.7),
                        legend.text=element_text(size=12, face="bold"),
                        legend.key = element_rect(colour = "black", fill="white")
                  )

# plot time series of monthly gcm projections by state----
## calculate temperature anomalies relative to baseline
gcm_baseline <- state_nex[year(monthyr) >= 2006 & year(monthyr) < 2020, .(baseline_rcp26 = mean(tmax_rcp26),
                        baseline_rcp45 = mean(tmax_rcp45),
                        baseline_rcp85 = mean(tmax_rcp85)),
                    by=.(month(monthyr), state)]
state_nex[, month := month(monthyr)]
setkey(gcm_baseline, state, month)
setkey(state_nex, state, month)
gcm <- merge(state_nex, gcm_baseline)
gcm <- gcm[year(monthyr) >= 2020,] # keep only 2020+
gcm[, ':=' (rcp26_anom = tmax_rcp26 - baseline_rcp26,
            rcp45_anom = tmax_rcp45 - baseline_rcp45,
            rcp85_anom = tmax_rcp85 - baseline_rcp85)]

dum <- data.frame(point = sort(rep(letters[1:3], 10)), x = gcm$monthyr, y = runif(30))
gcm_plot <- ggplot(gcm, aes(x=monthyr, group=state)) + 
                  geom_hline(aes(yintercept=0), linetype=2, color='gray80') +
                  geom_line(aes(y=rcp26_anom), color='#d5ed76', size=0.05, alpha=0.3) + 
                  geom_line(aes(y=rcp45_anom), color='gray60', size=0.05, alpha=0.3) + 
                  geom_line(aes(y=rcp85_anom), color='#FB8C00', size=0.05, alpha=0.3) +
                  ylab("Max. Temp. Anomaly in \u2103") +
                  xlab("Calendar Month") +
                  geom_line(data = dum, aes(color = point, x = x, y = y, group=NULL), alpha = 0, size=4) + 
                  scale_color_manual(guide = guide_legend(override.aes = list(alpha = 1)),
                     limits=c("#d5ed76","gray60","#FB8C00"), 
                     values=c("#d5ed76","gray60","#FB8C00"),
                     labels=c("RCP2.6","RCP4.5","RCP8.5")) +
                  coord_cartesian(xlim=c(2020,2100)) +
                  theme(plot.title = element_text(face="bold", size=40),
                        axis.title.x = element_text(size=20, face="bold"),
                        axis.title.y = element_text(size=20, face="bold"),
                        axis.text.y = element_text(size=12, face="bold"),
                        axis.text.x = element_text(size=12, face="bold"),
                        panel.grid.major = element_blank(),
                        panel.grid.minor = element_blank(),
                        panel.background = element_blank(),
                        axis.line.x = element_line(),
                        axis.line.y = element_line(),
                        axis.ticks.y = element_line(),
                        legend.title=element_blank(),
                        legend.position = c(.01, 0.7),
                        legend.text=element_text(size=12, face="bold"),
                        legend.key = element_rect(colour = "black", fill="white")
                  )

# vehicle miles traveled by-state climate projection----
## vmt
# vmt_projection_spline <- function(data, dv, iv, lhs) {
#   ## keep spline knots with the fitted model results for plotting
#   spl_knots <- wtd.quantile(iv[!is.na(dv)], weights=dv[!is.na(dv)], probs = c(0.25, 0.5, 0.75), na.rm=T) # get quantiles of iv weighted by dv (cannot have missing values)
#   spl_boundary_knots <- range(iv, na.rm=T) # get range of independent variable
#   xs <- as.data.table(ns(iv, knots = spl_knots, Boundary.knots = spl_boundary_knots)) # get cubic splines of iv
#   setnames(xs, paste0("ns_", 1:ncol(xs))) # set names of splines
#   data <- cbind(data, xs)
#   ## set up felm model
#   rhs <- paste(c(names(xs), "trange", "prcp.days", "cloud", "humid", "wind"), collapse = " + ")
#   fe <- paste(c("state:month", "state:year"), collapse = " + ")
#   fmla <- as.formula(sprintf("%s ~ %s | %s | 0 | state", lhs, rhs, fe))
#   fit_felm <- felm(fmla, data=data)
#   result <- foreach::foreach (j = c('tmax_rcp26','tmax_rcp45','tmax_rcp85'), .combine = cbind) %do% {
#             pred_tmax <- state_nex[, get(j)] # prediction iv
#             xp <- as.data.table(ns(pred_tmax, knots = spl_knots, Boundary.knots = spl_boundary_knots))
#             result <- foreach::foreach (i = 1:nrow(xp), .combine = rbind) %do% {
#               this.x <- pred_tmax[i]
#               this.row <- c(unlist(xp[i, ]), 0, 0, 0, 0, 0) # include zeros for meteorological controls
#               temp.dt <- as.data.table(svycontrast(fit_felm, this.row))
#               temp.dt[, x := this.x]
#               setnames(temp.dt, c(paste0(j,"_coef"), paste0(j,"_se"), paste0(j,"_x")))
#               return(temp.dt)
#             }
#     return(result)
#   }
#   return(result)
# } # modified function from patrick baylis
# projection <- vmt_projection_spline(data=vmt, dv=vmt$log_vmt, iv=vmt$tmax, lhs="log_vmt")[, b:= 0] # main
# projection <- cbind(state_nex, projection)
# projection[, c('b', 'tmax_rcp85_x', 'tmax_rcp45_x', 'tmax_rcp26_x') := NULL] # remove unneeded vars
# projection[, ':=' (tmax_rcp26_coef = tmax_rcp26_coef - max(tmax_rcp26_coef),
#                    tmax_rcp45_coef = tmax_rcp45_coef - max(tmax_rcp45_coef),
#                    tmax_rcp85_coef = tmax_rcp85_coef - max(tmax_rcp85_coef))] # normalize so max==0
# projection[, tmax_rcp26_upr := tmax_rcp26_coef + 1.96*tmax_rcp26_se]
# projection[, tmax_rcp26_lwr := tmax_rcp26_coef - 1.96*tmax_rcp26_se]
# projection[, tmax_rcp26_se := NULL]
# projection[, tmax_rcp45_upr := tmax_rcp45_coef + 1.96*tmax_rcp45_se]
# projection[, tmax_rcp45_lwr := tmax_rcp45_coef - 1.96*tmax_rcp45_se]
# projection[, tmax_rcp45_se := NULL]
# projection[, tmax_rcp85_upr := tmax_rcp85_coef + 1.96*tmax_rcp85_se]
# projection[, tmax_rcp85_lwr := tmax_rcp85_coef - 1.96*tmax_rcp85_se]
# projection[, tmax_rcp85_se := NULL]
# saveRDS(projection, "./state_projection_bootstrap.rds")
# rm(projection)
state_projection <- readRDS("./state_projection_bootstrap.rds")

# calculate and plot by-month cumulative projection for rcp26/45/85----
## need to use 2006-2020 tmax and calculate differences by state then cumulate
mean2006_2020 <- state_projection[year(monthyr) >= 2006 & year(monthyr) < 2020, 
                                  .(base_rcp26_coef = mean(tmax_rcp26_coef),
                                    base_rcp45_coef = mean(tmax_rcp45_coef),
                                    base_rcp85_coef = mean(tmax_rcp85_coef)), 
                                  by=.(month(monthyr), state)]
state_projection[, month := month(monthyr)]
setkey(state_projection, state, month)
setkey(mean2006_2020, state, month)
state_projection <- merge(state_projection, mean2006_2020)
rm(mean2006_2020)

## get baseline vmt (millions) by state for 2013-2017 (most recent five-year period)
baseline_vmt <- vmt[year(monthyr) >= 2013 & year(monthyr) <= 2017, .(vmt = mean(vmt, na.rm=T)), by=.(state, month(monthyr))]
setkey(baseline_vmt, state, month)
setkey(state_projection, state, month)
state_projection <- merge(state_projection, baseline_vmt)
rm(baseline_vmt)

## calculate changes relative to baseline
## exponentiate coefficients to get percentage change, convert to vmt, difference from fixed baseline
state_projection[, ':=' (delta_rcp26 = exp(tmax_rcp26_coef)*vmt - exp(base_rcp26_coef)*vmt,
                         delta_rcp45 = exp(tmax_rcp45_coef)*vmt - exp(base_rcp45_coef)*vmt,
                         delta_rcp85 = exp(tmax_rcp85_coef)*vmt - exp(base_rcp85_coef)*vmt)] 

## keep only year 2020+
state_projection <- state_projection[year(monthyr) >= 2020]

## cumulate changes by state across month-years
setkey(state_projection, state, monthyr)
state_cumul_changes <- state_projection[, .(cumul_rcp26 = cumsum(delta_rcp26),
                                            cumul_rcp45 = cumsum(delta_rcp45),
                                            cumul_rcp85 = cumsum(delta_rcp85),
                                          monthyr = unique(monthyr)), by=.(state)]

## cumulate changes by full us
us_cumul_changes <- state_cumul_changes[, .(cumul_rcp26 = sum(cumul_rcp26),
                                            cumul_rcp45 = sum(cumul_rcp45),
                                            cumul_rcp85 = sum(cumul_rcp85)), 
                                        by=.(monthyr)]

dum <- data.frame(point = sort(rep(letters[1:3], 10)), x = us_cumul_changes$monthyr, y = runif(30))
us_cumul_plot <- ggplot(us_cumul_changes, aes(x=monthyr)) + 
                  geom_hline(aes(yintercept=0), linetype=2, color='gray80') +
                  geom_line(aes(y=cumul_rcp26/1000), color='#d5ed76', size=3) + 
                  geom_line(aes(y=cumul_rcp45/1000), color='gray60', size=3) + 
                  geom_line(aes(y=cumul_rcp85/1000), color='#FB8C00', size=3) +
                  ylab("Cumulative Added VMT in Billions of Miles") +
                  xlab("Calendar Month") +
                  geom_line(data = dum, aes(color = point, x = x, y = y), alpha = 0, size=4) + 
                  scale_color_manual(guide = guide_legend(override.aes = list(alpha = 1)),
                     limits=c("#d5ed76","gray60","#FB8C00"), 
                     values=c("#d5ed76","gray60","#FB8C00"),
                     labels=c("RCP2.6","RCP4.5","RCP8.5")) +
                  coord_cartesian(ylim=c(0,1250)) +
                  scale_y_continuous(breaks=c(0, 250, 500, 750, 1000, 1250)) +
                  theme(plot.title = element_text(face="bold", size=40),
                        axis.title.x = element_text(size=20, face="bold"),
                        axis.title.y = element_text(size=25, face="bold"),
                        axis.text.y = element_text(size=12, face="bold"),
                        axis.text.x = element_text(size=12, face="bold"),
                        panel.grid.major = element_blank(),
                        panel.grid.minor = element_blank(),
                        panel.background = element_blank(),
                        axis.line.x = element_line(),
                        axis.line.y = element_line(),
                        axis.ticks.y = element_line(),
                        legend.title=element_blank(),
                        legend.position = c(.2, 0.7),
                        legend.text=element_text(size=17, face="bold"),
                        legend.key = element_rect(colour = "black", fill="white")
                  )

# calculate and plot 2040-2050 and 2090-2100 by-month projections----
month20402050 <- state_projection[year(monthyr) >= 2040 & year(monthyr) <= 2050, 
                                 .(delta_rcp26 = sum(delta_rcp26),
                                   delta_rcp45 = sum(delta_rcp45),
                                   delta_rcp85 = sum(delta_rcp85)),
                                 by=.(month)
                                 ]
month20402050 <- melt(month20402050, id.vars = 'month', variable.name = "rcp", value.name = 'vmt')

month20902100 <- state_projection[year(monthyr) >= 2090 & year(monthyr) <= 2100, 
                                 .(delta_rcp26 = sum(delta_rcp26),
                                   delta_rcp45 = sum(delta_rcp45),
                                   delta_rcp85 = sum(delta_rcp85)),
                                 by=.(month)
                                 ]
month20902100 <- melt(month20902100, id.vars = 'month', variable.name = "rcp", value.name = 'vmt')

## plot
rcp.colors <- c(delta_rcp26 = "#d5ed76", delta_rcp45 = "gray60", delta_rcp85 ="#FB8C00") # set colors

month20402050.plot <- ggplot(data=month20402050) +
    geom_hline(yintercept = 0, color="gray60") +
    geom_col(aes(x=month, y=vmt/1000, width=0.8, fill=factor(rcp)), color="gray60", position = "dodge") +
    coord_cartesian(ylim = c(-25,50), xlim = c(0.5,12.5)) +
    scale_x_continuous(breaks=c(1:12), labels=c('J','F','M','A','M','J','J','A','S','O','N','D')) +
    ylab("Added VMT in Billions") +
    xlab("Months, 2040-2050") +
    scale_fill_manual(guide = guide_legend(override.aes = list(alpha = 1)),
                     values=rcp.colors,
                     labels=c("RCP2.6","RCP4.5","RCP8.5")) +
    theme(plot.title = element_text(face="bold", size=40),
                        axis.title.x = element_text(size=20, face="bold"),
                        axis.title.y = element_blank(),
                        axis.text.y = element_blank(),
                        axis.text.x = element_text(size=12, face="bold"),
                        panel.grid.major = element_blank(),
                        panel.grid.minor = element_blank(),
                        panel.background = element_blank(),
                        axis.line.x = element_line(),
                        axis.line.y = element_blank(),
                        axis.ticks.y = element_blank(),
                        legend.title=element_blank(),
                        legend.position = c(.7, 0.2),
                        legend.text=element_text(size=12, face="bold"),
                        legend.key = element_rect(colour = "black", fill="white")
                  )

month209002100.plot <- ggplot(data=month20902100) +
    geom_hline(yintercept = 0, color="gray60") +
    geom_col(aes(x=month, y=vmt/1000, width=0.8, fill=factor(rcp)), color="gray60", position = "dodge") +
    coord_cartesian(ylim = c(-25,50), xlim = c(0.5,12.5)) +
    scale_x_continuous(breaks=c(1:12), labels=c('J','F','M','A','M','J','J','A','S','O','N','D')) +
    ylab("Added VMT in Billions") +
    xlab("Months, 2090-2100") +
    scale_fill_manual(guide = guide_legend(override.aes = list(alpha = 1)),
                     values=rcp.colors,
                     labels=c("RCP2.6","RCP4.5","RCP8.5")) +
    theme(plot.title = element_text(face="bold", size=40),
                        axis.title.x = element_text(size=20, face="bold"),
                        axis.title.y = element_blank(),
                        axis.text.y = element_blank(),
                        axis.text.x = element_text(size=12, face="bold"),
                        panel.grid.major = element_blank(),
                        panel.grid.minor = element_blank(),
                        panel.background = element_blank(),
                        axis.line.x = element_line(),
                        axis.line.y = element_blank(),
                        axis.ticks.y = element_blank(),
                        legend.title=element_blank(),
                        legend.position = "none"
                  )

justy.month <- ggplot(data=month20902100) +
    geom_hline(yintercept = 0, color="gray60") +
    geom_col(aes(x=month, y=vmt/1000, width=0.8, fill=factor(rcp)), color="gray60", position = "dodge") +
    coord_cartesian(ylim = c(-25,50), xlim = c(0.5,12.5)) +
    scale_x_continuous(breaks=c(1:12), labels=c('J','F','M','A','M','J','J','A','S','O','N','D')) +
    ylab("Added VMT in Billions") +
    xlab("Months, 2090-2100") +
    scale_fill_manual(guide = guide_legend(override.aes = list(alpha = 1)),
                     values=rcp.colors,
                     labels=c("RCP2.6","RCP4.5","RCP8.5")) +
    theme(plot.title = element_text(face="bold", size=40),
                        axis.title.x = element_text(size=20, face="bold"),
                        axis.title.y = element_text(size=20, face="bold"),
                        axis.text.y = element_text(size=12, face="bold"),
                        axis.text.x = element_text(size=12, face="bold"),
                        panel.grid.major = element_blank(),
                        panel.grid.minor = element_blank(),
                        panel.background = element_blank(),
                        axis.line.x = element_line(),
                        axis.line.y = element_line(),
                        axis.ticks.y = element_line(),
                        legend.title=element_blank(),
                        legend.position = c(.05, 0.2),
                        legend.text=element_text(size=16, face="bold"),
                        legend.key = element_rect(colour = "black", fill="white")
                  )
justy.month <- gtable_filter(ggplotGrob(justy.month), 'axis-l|ylab', trim=F)
justy.month <- ggdraw(justy.month)

# calculate and plot 2040-2050 and 2090-2100 by-state cumulative map projections----
state_projection <- readRDS("./state_projection_bootstrap.rds")

## need to use 2006-2020 tmax baseline and calculate differences by state then cumulate
mean2006_2020 <- state_projection[year(monthyr) >= 2006 & year(monthyr) < 2020, 
                                  .(base_rcp26_coef = mean(tmax_rcp26_coef),
                                    base_rcp45_coef = mean(tmax_rcp45_coef),
                                    base_rcp85_coef = mean(tmax_rcp85_coef)), 
                                  by=.(month(monthyr), state)]
state_projection[, month := month(monthyr)]
setkey(state_projection, state, month)
setkey(mean2006_2020, state, month)
state_projection <- merge(state_projection, mean2006_2020)
rm(mean2006_2020)

## get baseline vmt (millions) by state for 2013-2017 (most recent full five-year period)
baseline_vmt <- vmt[year(monthyr) >= 2013 & year(monthyr) <= 2017, .(vmt = mean(vmt, na.rm=T)), by=.(state, month(monthyr))]
setkey(baseline_vmt, state, month)
setkey(state_projection, state, month)
state_projection <- merge(state_projection, baseline_vmt)
rm(baseline_vmt)

## calculate changes relative to baseline
## exponentiate coefficients to get percentage change, convert to vmt, difference from fixed baseline
state_projection[, ':=' (delta_rcp26 = exp(tmax_rcp26_coef)*vmt - exp(base_rcp26_coef)*vmt,
                         delta_rcp45 = exp(tmax_rcp45_coef)*vmt - exp(base_rcp45_coef)*vmt,
                         delta_rcp85 = exp(tmax_rcp85_coef)*vmt - exp(base_rcp85_coef)*vmt)] 

## keep only year 2020+
state_projection <- state_projection[year(monthyr) >= 2020]

## cumulate changes by state across month-years
setkey(state_projection, state, monthyr)
state_cumul_changes <- state_projection[, .(cumul_rcp26 = cumsum(delta_rcp26),
                                            cumul_rcp45 = cumsum(delta_rcp45),
                                            cumul_rcp85 = cumsum(delta_rcp85),
                                          monthyr = unique(monthyr)), by=.(state)]

## mean of cumulative changes by state in 2050, 2099 by model
state20502099 <- state_cumul_changes[year(monthyr)==2050 | year(monthyr)==2099, 
                                     .(cumul_rcp26 = mean(cumul_rcp26),
                                       cumul_rcp45 = mean(cumul_rcp45),
                                       cumul_rcp85 = mean(cumul_rcp85)),
                                     by=.(year(monthyr), state)]

## plot
## vmt
names(states)[6] <- "state" # fix names
states$state <- as.character(states$state)
cumul.vmt.2050 <- state20502099[year==2050, 
                                .(cumul_rcp26 = log(cumul_rcp26*1000000),
                                 cumul_rcp45 = log(cumul_rcp45*1000000),
                                 cumul_rcp85 = log(cumul_rcp85*1000000)
                                     ), by=.(state)]
cumul.vmt.2050[, state := as.character(state)]
cumul.vmt.2050 <- sp::merge(states, cumul.vmt.2050, by="state")

cumul.vmt.2099 <- state20502099[year==2099, 
                                .(cumul_rcp26 = log(cumul_rcp26*1000000),
                                 cumul_rcp45 = log(cumul_rcp45*1000000),
                                 cumul_rcp85 = log(cumul_rcp85*1000000)
                                     ), by=.(state)]
cumul.vmt.2099[, state := as.character(state)]
cumul.vmt.2099 <- sp::merge(states, cumul.vmt.2099, by="state")

p1 <- spplot(
       obj=cumul.vmt.2050,
       zcol=c("cumul_rcp26","cumul_rcp45","cumul_rcp85"),
       strip=FALSE,
       col.regions=colorRampPalette(c("#ffdbaf", "#ffce91", "#FB8C00", "#e27e00", "#c16b01","#9e3f00"))(100), # set colors
       as.table = TRUE,
       col=NA,
       xlab=NULL, # kill x-axis
       ylab=NULL, # kill y-axis
       scales=list(draw=FALSE),
       margin=FALSE, # remove margin padding
       colorkey=list(space="bottom", labels=list(cex=1.2), height=0.5), # make legend on bottom of plot
       at=seq(min(cumul.vmt.2050$cumul_rcp26, na.rm=TRUE), max(cumul.vmt.2099$cumul_rcp85, na.rm=TRUE)+0.000001, length.out=100),
       par.settings = plasmaTheme(),
       sub=list("log(Cumulative Added Vehicle Miles Traveled)", cex=1.4),
       layout=c(3,1) # horizontal layout
       )
p1 <- update(p1, par.settings = list(axis.line = list(col = "transparent"),
                                   par.main.text=list(cex=2),
                                   panel.background=list(col="transparent")))
p1 <- p1 + layer_(sp.lines(obj=us, fill="white", col="black"))
p1 <- p1 + layer(sp.lines(obj=states,
                              fill="transparent",
                              col="black",
                              cex=0.1))

p2 <- spplot(
       obj=cumul.vmt.2099,
       zcol=c("cumul_rcp26","cumul_rcp45","cumul_rcp85"),
       strip=FALSE,
       col.regions=colorRampPalette(c("#ffdbaf", "#ffce91", "#FB8C00", "#e27e00", "#c16b01","#9e3f00"))(100), # set colors
       as.table = TRUE,
       col=NA,
       xlab=NULL, # kill x-axis
       ylab=NULL, # kill y-axis
       scales=list(draw=FALSE),
       margin=FALSE, # remove margin padding
       colorkey=list(space="bottom", labels=list(cex=1.2), height=0.5), # make legend on bottom of plot
       at=seq(min(cumul.vmt.2050$cumul_rcp26, na.rm=TRUE), max(cumul.vmt.2099$cumul_rcp85, na.rm=TRUE)+0.000001, length.out=100),
       par.settings = plasmaTheme(),
       sub=list("log(Cumulative Added Vehicle Miles Traveled)", cex=1.4),
       layout=c(3,1) # horizontal layout
       )
p2 <- update(p2, par.settings = list(axis.line = list(col = "transparent"),
                                   par.main.text=list(cex=2),
                                   panel.background=list(col="transparent")))
p2 <- p2 + layer_(sp.lines(obj=us, fill="white", col="black"))
p2 <- p2 + layer(sp.lines(obj=states,
                              fill="transparent",
                              col="black",
                              cex=0.1))

## get legend for plotting
key <- draw.colorkey(p1$legend[[1]]$args$key)
p1$legend <- NULL
p2$legend <- NULL
p1$sub <- NULL
p2$sub <- NULL

# plot----
cairo_pdf(filename = './figure3.pdf', width=14, height=15)
grid.arrange(arrangeGrob(rectGrob(gp=gpar(col="white")), 
             p1, 
             p2, 
             key, 
             textGrob("log(Cumulative Added Vehicle Miles Traveled)", gp=gpar(fontsize=20, font=2)),
             nrow=5, 
             heights=c(0.02, 0.42, 0.42, 0.08, 0.06)),
             rectGrob(gp=gpar(col="white")),
             arrangeGrob(us_cumul_plot,
                         arrangeGrob(arrangeGrob(tmax.hist, gcm_plot, ncol=2),
                                     arrangeGrob(rectGrob(gp=gpar(col="white")), 
                                                 justy.month, 
                                                 month20402050.plot, 
                                                 month209002100.plot, 
                                                 ncol=4, 
                                                 widths=c(0.035, 0.065, 0.45, 0.45)),
                                     nrow=2),
                         ncol=2),
             nrow=3, heights=c(0.49, 0.02, 0.49))
grid.text("a", x=unit(0.02, "npc"), y=unit(0.98, "npc"), gp=gpar(fontsize=30, font=2))
grid.text("b", x=unit(0.08, "npc"), y=unit(0.48, "npc"), gp=gpar(fontsize=30, font=2))
grid.text("c", x=unit(0.575, "npc"), y=unit(0.48, "npc"), gp=gpar(fontsize=30, font=2))
grid.text("d", x=unit(0.8175, "npc"), y=unit(0.48, "npc"), gp=gpar(fontsize=30, font=2))
grid.text("e", x=unit(0.57, "npc"), y=unit(0.06, "npc"), gp=gpar(fontsize=30, font=2))
grid.text("f", x=unit(0.795, "npc"), y=unit(0.06, "npc"), gp=gpar(fontsize=30, font=2))
grid.text("2050", x=unit(0.5, "npc"), y=unit(0.98, "npc"), gp=gpar(fontsize=30, font=2))
grid.text("2099", x=unit(0.5, "npc"), y=unit(0.775, "npc"), gp=gpar(fontsize=30, font=2))
grid.text("RCP2.6", x=unit(0.1675, "npc"), y=unit(0.9, "npc"), gp=gpar(fontsize=30, font=2, col="white"))
grid.text("RCP2.6", x=unit(0.1675, "npc"), y=unit(0.6925, "npc"), gp=gpar(fontsize=30, font=2, col="white"))
grid.text("RCP4.5", x=unit(0.4922, "npc"), y=unit(0.9, "npc"), gp=gpar(fontsize=30, font=2, col="white"))
grid.text("RCP4.5", x=unit(0.4922, "npc"), y=unit(0.6925, "npc"), gp=gpar(fontsize=30, font=2, col="white"))
grid.text("RCP8.5", x=unit(0.815, "npc"), y=unit(0.9, "npc"), gp=gpar(fontsize=30, font=2, col="white"))
grid.text("RCP8.5", x=unit(0.815, "npc"), y=unit(0.6925, "npc"), gp=gpar(fontsize=30, font=2, col="white"))
dev.off()
```

```{r fig4, eval=FALSE}
# vehicle miles traveled by-state climate projection----
## vmt
# vmt_projection_spline <- function(data, dv, iv, lhs) {
#   ## keep spline knots with the fitted model results for plotting
#   spl_knots <- wtd.quantile(iv[!is.na(dv)], weights=dv[!is.na(dv)], probs = c(0.25, 0.5, 0.75), na.rm=T) # get quantiles of iv weighted by dv (cannot have missing values)
#   spl_boundary_knots <- range(iv, na.rm=T) # get range of independent variable
#   xs <- as.data.table(ns(iv, knots = spl_knots, Boundary.knots = spl_boundary_knots)) # get cubic splines of iv
#   setnames(xs, paste0("ns_", 1:ncol(xs))) # set names of splines
#   data <- cbind(data, xs)
#   ## set up felm model
#   rhs <- paste(c(names(xs), "trange", "prcp.days", "cloud", "humid", "wind"), collapse = " + ")
#   fe <- paste(c("state:month", "state:year"), collapse = " + ")
#   fmla <- as.formula(sprintf("%s ~ %s | %s | 0 | state", lhs, rhs, fe))
#   fit_felm <- felm(fmla, data=data)
#   result <- foreach::foreach (j = c('tmax_rcp26','tmax_rcp45','tmax_rcp85'), .combine = cbind) %do% {
#             pred_tmax <- state_nex[, get(j)] # prediction iv
#             xp <- as.data.table(ns(pred_tmax, knots = spl_knots, Boundary.knots = spl_boundary_knots))
#             result <- foreach::foreach (i = 1:nrow(xp), .combine = rbind) %do% {
#               this.x <- pred_tmax[i]
#               this.row <- c(unlist(xp[i, ]), 0, 0, 0, 0, 0) # include zeros for meteorological controls
#               temp.dt <- as.data.table(svycontrast(fit_felm, this.row))
#               temp.dt[, x := this.x]
#               setnames(temp.dt, c(paste0(j,"_coef"), paste0(j,"_se"), paste0(j,"_x")))
#               return(temp.dt)
#             }
#     return(result)
#   }
#   return(result)
# } # modified function from patrick baylis
# projection <- vmt_projection_spline(data=vmt, dv=vmt$log_vmt, iv=vmt$tmax, lhs="log_vmt")[, b:= 0] # main
# projection <- cbind(state_nex, projection)
# projection[, c('b', 'tmax_rcp85_x', 'tmax_rcp45_x', 'tmax_rcp26_x') := NULL] # remove unneeded vars
# projection[, ':=' (tmax_rcp26_coef = tmax_rcp26_coef - max(tmax_rcp26_coef),
#                    tmax_rcp45_coef = tmax_rcp45_coef - max(tmax_rcp45_coef),
#                    tmax_rcp85_coef = tmax_rcp85_coef - max(tmax_rcp85_coef))] # normalize so max==0
# projection[, tmax_rcp26_upr := tmax_rcp26_coef + 1.96*tmax_rcp26_se]
# projection[, tmax_rcp26_lwr := tmax_rcp26_coef - 1.96*tmax_rcp26_se]
# projection[, tmax_rcp26_se := NULL]
# projection[, tmax_rcp45_upr := tmax_rcp45_coef + 1.96*tmax_rcp45_se]
# projection[, tmax_rcp45_lwr := tmax_rcp45_coef - 1.96*tmax_rcp45_se]
# projection[, tmax_rcp45_se := NULL]
# projection[, tmax_rcp85_upr := tmax_rcp85_coef + 1.96*tmax_rcp85_se]
# projection[, tmax_rcp85_lwr := tmax_rcp85_coef - 1.96*tmax_rcp85_se]
# projection[, tmax_rcp85_se := NULL]
# saveRDS(projection, "./state_projection_bootstrap.rds")
# rm(projection)
state_projection <- readRDS("./state_projection_bootstrap.rds")

# calculate and plot by-month cumulative projection for rcp26/45/85----
## need to use 2006-2020 tmax baseline and calculate differences by state then cumulate
mean2006_2020 <- state_projection[year(monthyr) >= 2006 & year(monthyr) < 2020, 
                                  .(base_rcp26_coef = mean(tmax_rcp26_coef),
                                    base_rcp45_coef = mean(tmax_rcp45_coef),
                                    base_rcp85_coef = mean(tmax_rcp85_coef)), 
                                  by=.(month(monthyr), state)]
state_projection[, month := month(monthyr)]
setkey(state_projection, state, month)
setkey(mean2006_2020, state, month)
state_projection <- merge(state_projection, mean2006_2020)
rm(mean2006_2020)

## get baseline vmt (millions) by state for 2013-2017 (most recent full five-year period)
baseline_vmt <- vmt[year(monthyr) >= 2013 & year(monthyr) <= 2017, .(vmt = mean(vmt, na.rm=T)), by=.(state, month(monthyr))]
setkey(baseline_vmt, state, month)
setkey(state_projection, state, month)
state_projection <- merge(state_projection, baseline_vmt)
rm(baseline_vmt)

## calculate changes relative to baseline
## exponentiate coefficients to get percentage change, difference from fixed baseline
state_projection[, ':=' (delta_rcp26 = 100*(exp(tmax_rcp26_coef) - exp(base_rcp26_coef)),
                         delta_rcp45 = 100*(exp(tmax_rcp45_coef) - exp(base_rcp45_coef)),
                         delta_rcp85 = 100*(exp(tmax_rcp85_coef) - exp(base_rcp85_coef)))] 

## keep only year 2020+
state_projection <- state_projection[year(monthyr) >= 2020]

## get by-month
month20402050 <- state_projection[year(monthyr) >= 2040 & year(monthyr) <= 2050, 
                                 .(delta_rcp26 = mean(delta_rcp26, na.rm=T),
                                   delta_rcp45 = mean(delta_rcp45, na.rm=T),
                                   delta_rcp85 = mean(delta_rcp85, na.rm=T)),
                                 by=.(month, state=as.character(state))
                                 ]
month20402050 <- melt(month20402050, id.vars = c('month', 'state'), variable.name = "rcp", value.name = 'vmt')

month20902100 <- state_projection[year(monthyr) >= 2090 & year(monthyr) <= 2100, 
                                 .(delta_rcp26 = mean(delta_rcp26, na.rm=T),
                                   delta_rcp45 = mean(delta_rcp45, na.rm=T),
                                   delta_rcp85 = mean(delta_rcp85, na.rm=T)),
                                 by=.(month, state=as.character(state))
                                 ]
month20902100 <- melt(month20902100, id.vars = c('month','state'), variable.name = "rcp", value.name = 'vmt')

## plot
names(states)[6] <- "state" # fix names
states$state <- as.character(states$state)
month2050rcp26 <- month20402050[rcp=="delta_rcp26", .(vmt), by=.(month, state)]
month2099rcp26 <- month20902100[rcp=="delta_rcp26", .(vmt), by=.(month, state)]
month2050rcp26 <- dcast(month2050rcp26, state ~ paste0('month',month), value.var="vmt")
month2099rcp26 <- dcast(month2099rcp26, state ~ paste0('month',month),, value.var="vmt")
month2050rcp26 <- sp::merge(states, month2050rcp26, by="state", duplicateGeoms=TRUE)
month2099rcp26 <- sp::merge(states, month2099rcp26, by="state", duplicateGeoms=TRUE)

month2050rcp45 <- month20402050[rcp=="delta_rcp45", .(vmt), by=.(month, state)]
month2099rcp45 <- month20902100[rcp=="delta_rcp45", .(vmt), by=.(month, state)]
month2050rcp45 <- dcast(month2050rcp45, state ~ paste0('month',month),, value.var="vmt")
month2099rcp45 <- dcast(month2099rcp45, state ~ paste0('month',month),, value.var="vmt")
month2050rcp45 <- sp::merge(states, month2050rcp45, by="state", duplicateGeoms=TRUE)
month2099rcp45 <- sp::merge(states, month2099rcp45, by="state", duplicateGeoms=TRUE)

month2050rcp85 <- month20402050[rcp=="delta_rcp85", .(vmt), by=.(month, state)]
month2099rcp85 <- month20902100[rcp=="delta_rcp85", .(vmt), by=.(month, state)]
month2050rcp85 <- dcast(month2050rcp85, state ~ paste0('month',month),, value.var="vmt")
month2099rcp85 <- dcast(month2099rcp85, state ~ paste0('month',month),, value.var="vmt")
month2050rcp85 <- sp::merge(states, month2050rcp85, by="state", duplicateGeoms=TRUE)
month2099rcp85 <- sp::merge(states, month2099rcp85, by="state", duplicateGeoms=TRUE)

# rcp85----
p1 <- spplot(
       obj=month2050rcp85,
       zcol=c('month1','month2','month3','month4','month5','month6','month7','month8','month9','month10','month11','month12'),         
       names.attr = paste(c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October","November","December")), # set names for strip text
       strip=TRUE, # include strip text
       col.regions=colorRampPalette(c("#bcd168", "#d5ed76", "#ffc67f", "#FB8C00", "#e27e00", "#9e3f00"))(100), # set colors
       as.table = TRUE,
       # col=NA,
       xlab=NULL, # kill x-axis
       ylab=NULL, # kill y-axis
       scales=list(draw=FALSE),
       margin=FALSE, # remove margin padding
       colorkey=list(space="bottom", labels=list(cex=1.2), height=0.5), # make legend on bottom of plot
       at=seq(min(c(month2099rcp85$month6,month2099rcp85$month7,month2099rcp85$month8,month2099rcp85$month9), na.rm=TRUE), max(c(month2099rcp85$month11,month2099rcp85$month12,month2099rcp85$month1,month2099rcp85$month2), na.rm=TRUE)+0.000001, length.out=100),
       par.settings = plasmaTheme(),
       layout=c(4,3) # horizontal layout
       )
p1 <- update(p1, par.settings = list(axis.line = list(col = "transparent"),
                                   par.main.text=list(cex=2),
                                   panel.background=list(col="transparent")))
p1 <- p1 + layer_(sp.lines(obj=us, fill="white", col="black"))
p1 <- p1 + layer(sp.lines(obj=states,
                              fill="transparent",
                              col="black",
                              cex=0.1))

p2 <- spplot(
       obj=month2099rcp85,
       zcol=c('month1','month2','month3','month4','month5','month6','month7','month8','month9','month10','month11','month12'),         
       names.attr = paste(c("January", "February", "March", "April", "May", "June", "July", "August", "September", "October","November","December")), # set names for strip text
       strip=TRUE, # include strip text
       col.regions=colorRampPalette(c("#bcd168", "#d5ed76", "#ffc67f", "#FB8C00", "#e27e00", "#9e3f00"))(100), # set colors
       as.table = TRUE,
       # col=NA,
       xlab=NULL, # kill x-axis
       ylab=NULL, # kill y-axis
       scales=list(draw=FALSE),
       margin=FALSE, # remove margin padding
       colorkey=list(space="bottom", labels=list(cex=1.2), height=0.5), # make legend on bottom of plot
       at=seq(min(c(month2099rcp85$month6,month2099rcp85$month7,month2099rcp85$month8,month2099rcp85$month9), na.rm=TRUE), max(c(month2099rcp85$month11,month2099rcp85$month12,month2099rcp85$month1,month2099rcp85$month2), na.rm=TRUE)+0.000001, length.out=100),
       par.settings = plasmaTheme(),
       layout=c(4,3) # horizontal layout
       )
p2 <- update(p2, par.settings = list(axis.line = list(col = "transparent"),
                                   par.main.text=list(cex=2),
                                   panel.background=list(col="transparent")))
p2 <- p2 + layer_(sp.lines(obj=us, fill="white", col="black"))
p2 <- p2 + layer(sp.lines(obj=states,
                              fill="transparent",
                              col="black",
                              cex=0.1))

## get legend for plotting
key <- draw.colorkey(p1$legend[[1]]$args$key)
p1$legend <- NULL
p2$legend <- NULL
p1$sub <- NULL
p2$sub <- NULL

# plot----
cairo_pdf(filename = './figure4.pdf', width=9.75, height=11)
grid.arrange(rectGrob(gp=gpar(col="white")), 
             p1, 
             p2, 
             key, 
             textGrob("Percentage Change in VMT, RCP8.5", gp=gpar(fontsize=23, font=2)),
             nrow=5, 
             heights=c(0.02, 0.43, 0.43, 0.07, 0.05))
grid.text("2040-2050", x=unit(0.5, "npc"), y=unit(0.98, "npc"), gp=gpar(fontsize=21, font=2))
grid.text("2090-2100", x=unit(0.5, "npc"), y=unit(0.545, "npc"), gp=gpar(fontsize=21, font=2))
dev.off()
```

\begin{center}
\vspace{-1ex}
\textit{abbreviations}: vehicle miles traveled (VMT), public transit trips (PTT)
\end{center}

Human activities are changing the global climate [@stocker2013climate] at historically rapid rates [@snyder2016evolution]. And these climatic changes, in turn, are shaping the human system [@carleton2016social]. From psychological health [@obradovich2018empirical] and expectations [@moore2019rapidly] to physiological well-being [@obradovich2017nighttime;@obradovich2018sleep], from exercise [@obradovich2017climate] to mood [@baylis2018weather], from daily productivity [@zhang2018prod] to economy-wide growth [@burke2018large], and from public safety [@ranson2014crime;@obradovich2018effects] to civil discord [@hsiang2013quantifying;@obradovich2017demo], warming stands to substantially alter human outcomes. Though potential feedback loops exist [@lashof1989dynamic;@agrawala2007climate;@palmer2014model;@beckage2018linking], scholars predominantly investigate these two topics -- human contributions to climate change and the social impacts of a changing climate -- in isolation [@calvin2018integrated]. How the impacts of climate change on human systems might, in turn, alter the future trajectories of anthropogenic climate change remains largely unexplored.

In contrast, scientists have uncovered potential self-reinforcing loops between the climate system and biophysical earth systems [@arora2013carbon], including those provided by melting permafrost, changes in surface albedo, persistent loss of forest cover, and alterations in precipitation regimes, among numerous other feedbacks [@lenton2008tipping;@lenton2011early;@van2015causal;@hayhoe2016surprises]. Though these factors add substantial complexity when incorporated into climate models, they stand to markedly improve projections of future climatic outcomes [@bonan2018climate]. Given that humans are the primary contributors to present climatic changes [@stocker2013climate], omission of potential temperature-human behavior feedback from these models might bias projections of future climate even more than omission of biophysical feedback [@jarvis2012climate;@beckage2018linking;@woodard2018economic].

Of potential temperature-behavior feedback loops, human transportation behaviors are of particular import due to the large contribution of transportation to global greenhouse gas emissions [@schafer1999global;@unger2010attribution;@creutzig2015transport]. If climatic factors alter transportation usage, changes in climate may produce positive or negative feedback via the transportation sector. Recent studies provide rationale to hypothesize that climate change could alter transportation demand. For example, individuals increase their recreational physical activity in warmer temperatures [@brownson2001environmental;@obradovich2017climate], and existing evidence suggests that weather might shape transportation behaviors [@liu2015investigating;@liu2016estimating]. If climatic changes were to translate into an increased demand for travel, perhaps to see friends or to participate in recreational activities, warmer temperatures might increase demand for transport [@bocker2013climate;@clarke2015impact]. However, literature on the impacts of climate change and transportation has focused predominantly on the potential for extreme climatic impacts to periodically disrupt transportation systems [@koetse2009impact;@bocker2013impact] without analysis of possible feedback loops.

These facts combine to suggest that warming has the potential to induce changes in transportation behaviors that in turn alter the accumulation of greenhouse gases in Earth's atmosphere, producing a vicious circle in the earth system. To examine this hypothesis, we measure the effect of temperature on US transportation patterns across `r round(sum(vmt$vmt*1000000, na.rm=T)/1000000000000)` trillion vehicle miles traveled (VMT) and `r round(sum(upt$trips, na.rm=T)/1000000000)` billion individual trips on public transit (PTT) between 2002 and 2018 (see `r fig_num("data.plot", display="cite")`). Using these data and methods drawn from climate econometrics [@hsiang2016econometrics], we examine four primary questions.

```{r}
data.plot <- fig_num(name = "data.plot", caption = "")
```

```{r fig.data, out.width="100%", fig.cap=paste0(fig_num("data.plot", display="cite"),". \\textbf{Vehicle miles traveled and public transit trips.} This figure depicts the US states covered by our sources of vehicle miles traveled (VMT) and passenger public transit trips (PTT) as well as the national monthly variation in each series. Panel a plots the cross-sectional state variation of the over 40 trillion VMT in our data. Panel b plots the state variation of the over 150 billion PTT in our data. For PTT, we map metropolitan areas into the states where the centroid of the metropolitan boundary area falls. The US road network is depicted in a-b. Panel c displays the seasonality of VMT and illustrates a slight increase in VMT between 2003 and 2018. Panel d depicts that PTT display less seasonality in their temporal variation than do VMT.")}
knitr::include_graphics("./figure1.pdf")
```

First, does ambient temperature alter the use of vehicular transportation, the largest emitter in the transport sector [@fuglestvedt2008climate]? Second, does temperature also affect use of public transit? Third, do the effects of temperature on transportation vary by demographic or climatic context? Finally, how might anthropogenic climate change alter US transportation usage in the future?

To investigate the impact of temperature on vehicular transportation, we employ monthly, state-level data on the log of vehicle miles traveled (VMT) from the US Federal Highway Administration and combine these data with gridded monthly maximum temperature data from the PRISM climate group (see *Methods: Data*). Our data contain over 40 trillion VMT. We identify the causal effect of average maximum monthly temperatures using techniques from climate econometrics [@hsiang2016econometrics]. Our approach deconvolves the maximum temperature and vehicle miles traveled series into anomalies from normal for each state, month, and year. We graphically depict this process for our VMT data in `r fig_num("main", display="cite")`e-f. Once deconvolved, the remaining variation in temperature is as good as randomly assigned to the remaining variation in VMT; to bias our estimation, a confounding series would need to systematically co-vary with both temperature anomalies and VMT anomalies but not itself be caused by temperature anomalies [@hsiang2016econometrics;@carleton2016social]. We then employ these deconvolved series to examine the marginal effect of an anomalous one degree increase in temperature on anomalous changes in VMT, across each portion of the underlying temperature distribution. We depict the intuition behind our identification in `r fig_num("main", display="cite")`c. To smoothly combine the analyses in `r fig_num("main", display="cite")`c, we employ a cubic spline approach (see *Methods: Estimation of historical impacts*). Of note, we omit non-climatic control variables, such as socioeconomic covariates, from our estimation procedures because of their potential to generate bias in our parameters of interest -- a phenomenon known as a `bad controls' or a 'post-treatment controls' [@hsiang2013quantifying;@acharya2016explaining;@hsiang2016econometrics].

```{r}
main <- fig_num(name = "main", caption = "")
```

```{r fig.main, out.width="100%", fig.cap=paste0(fig_num("main", display="cite"),". \\textbf{Warmer temperatures increase vehicle miles traveled and public transit trips.} Panel a displays the marginal effects estimated from our splined regression model from Equation 1 of monthly maximum temperatures on the log of VMT. Warmer temperatures, up to approximately 30C, amplify VMT. Panel b depicts the estimated splined regression for the relationship between monthly maximum temperatures and the log of PTT. Warmer temperatures, again up to approximately 30C, amplify PTT. Error regions in panels a-b consist of 1,000 splined regressions, cluster bootstrapped on state or city from each sample, respectively. Panels c-d depict the deconvolved identifying variation in temperature and our deconvolved outcome measures that underlie the estimated results in panels a-b. We separate the deconvolved data into five-degree windows by level of normal state-month (panel c) and city-month (panel d) maximum temperatures. Anomalous increases in monthly maximum temperatures increase both VMT and PTT over most of the distribution of normal maximum temperatures. At the hotter end of the distribution, these relationships attenuate in magnitude. Lines in panels c-d represent linear fits within each temperature window. Panels e-f plot both the state demeaned and fully deconvolved VMT and temperature series, respectively, from our VMT data for a randomly sampled state. Panels g-h replicate panels e-f for a randomly sampled city from our PTT data.")}
knitr::include_graphics("./figure2.pdf")
```

`r fig_num("main", display="cite")`a displays the results of our VMT analysis (importantly, all models control for number of monthly days with measurable precipitation, monthly average diurnal temperature range, percentage cloud cover, relative humidity, and wind speed; see *Methods: Data*). Warmer temperatures cause VMT to increase predominantly linearly up to approximately `r round(vmt.main$x[vmt.main$b==0 & vmt.main$coef_norm==max(vmt.main$coef_norm)])`°C (the `r round(ecdf(vmt$tmax)(vmt.main$x[vmt.main$b==0 & vmt.main$coef_norm==max(vmt.main$coef_norm)]), 2)*100`th percentile of monthly temperature) and to decline slightly past this point. This effect is large in magnitude. Moving from freezing temperatures to `r round(vmt.main$x[vmt.main$b==0 & vmt.main$coef_norm==max(vmt.main$coef_norm)])`°C produces an approximately `r round((exp(mean(vmt.main$coef_norm[vmt.main$b==0 & vmt.main$x > 28.5 & vmt.main$x <= 29.5], na.rm=T)) - exp(mean(vmt.main$coef_norm[vmt.main$b==0 & vmt.main$x > -0.5 & vmt.main$x <= 0.5], na.rm=T)))*100)`% increase in VMT. Added warming from 30°C-40°C produces a change in VMT of approximately `r round((exp(mean(vmt.main$coef_norm[vmt.main$b==0 & vmt.main$x > 39], na.rm=T)) - exp(mean(vmt.main$coef_norm[vmt.main$b==0 & vmt.main$x > 29.5 & vmt.main$x <= 30.5], na.rm=T)))*100)`%, a smaller effect that is also estimated with higher statistical uncertainty. Thus, warmer temperatures amplify VMT across nearly the full temperature distribution.

Does temperature only impact vehicular transport, or does it more broadly impact transportation demand? To examine this question, we employ data on the monthly number of trips on public transit (PTT) for US metropolitan areas from the Federal Transit Administration, again linking these data to our meteorological variables of interest (see *Methods: Data*). These trips provide a measure of public transit ridership demand -- for both rail and bus -- across the entire US. Our data contain over 150 billion PTT. We analyze our PTT data using similar methods as for our VMT data (see *Methods: Estimation of historical impacts*). `r fig_num("main", display="cite")`g-h depict an example deconvolution of these data, and `r fig_num("main", display="cite")`d depicts the analyses of anomalies across levels of the temperature distribution.

`r fig_num("main", display="cite")`b displays the results of our PTT analysis. Warmer temperatures increase PTT linearly up to approximately `r round(upt.main$x[upt.main$b==0 & upt.main$coef_norm==max(upt.main$coef_norm)])`°C (the `r round(ecdf(upt$tmax)(upt.main$x[upt.main$b==0 & upt.main$coef_norm==max(upt.main$coef_norm)]), 2)*100`th percentile of monthly maximum temperatures) and cause PTT to decline minimally past this point. Like for VMT, the effect of temperature on PTT is large. Moving from freezing temperatures to `r round(upt.main$x[upt.main$b==0 & upt.main$coef_norm==max(upt.main$coef_norm)])`°C produces an approximately `r round((exp(mean(upt.main$coef_norm[upt.main$b==0 & upt.main$x > 30.5 & upt.main$x <= 31.5], na.rm=T)) - exp(mean(upt.main$coef_norm[upt.main$b==0 & upt.main$x > -0.5 & upt.main$x <= 0.5], na.rm=T)))*100)`% increase in public transit trips. Added warming from 31°C-40°C produces a change in PTT of approximately `r round((exp(mean(upt.main$coef_norm[upt.main$b==0 & upt.main$x > 39], na.rm=T)) - exp(mean(upt.main$coef_norm[upt.main$b==0 & upt.main$x > 30.5 & upt.main$x <= 31.5], na.rm=T)))*100)`%, an effect that is estimated with higher statistical uncertainty. The effects of temperature on PTT generally resemble the effects of temperature on VMT. Thus, warmer temperatures amplify demand for both vehicular transportation and public transit across the majority of the temperature distribution.

Do the estimated effects of temperature on transportation demand vary by the characteristics of localities? For example, do more urban or more rural places respond more acutely to warmer temperatures? Or do places with cooler average climates respond differently to warmer temperatures than do locales with warmer climates? To investigate questions of heterogeneous effects, we split our sample along the relevant variables and estimate our splined regression model for each subsample.

Performing these analyses, we find that the effect of temperature on VMT is nearly identical for VMT in urban areas as compared to VMT in rural areas (Fig. S1a), observe little difference in functional form between states with warmer climates and states with cooler climates (Fig. S1b), and note high similarity between those states with high rates of per-capita VMT as compared to those states with lower per-capita VMT (Fig. S1c). We observe little evidence of heterogeneous effects of temperature on vehicular transport. However, we observe more substantial evidence of heterogeneity in the effect of temperature on public transit demand. For example, the effects of temperature on rail-based transit are notably larger than on non-rail based transit (Fig. S1d). Added study is necessary to identify the precise mechanisms underlying this result. Further, we observe some evidence of potential adaptation with respect to climate for our PTT series; warmer cities observe more marked decreases in PTT due to cold temperatures than do cooler cities (Fig. S1e). We do not observe any substantial heterogeneity in effect, however, across cities with higher numbers of PTT per-capita as compared to those with lower numbers of PTT per-capita (Fig. S1f). Importantly, across each of our heterogeneous effects analyses we observe that warmer temperatures increase the demand for transportation -- both vehicular and via public transit -- across the bulk of the temperature distribution.

Thus, warmer temperatures increased US transportation usage over recent years. Might added anthropogenic warming amplify future US demand for transportation? To examine this question, we employ downscaled, monthly resolution climate model data from NASA Earth Exchange (see *Methods: Data*). We then extract these data to the administrative units from our analyses and employ the empirically estimated response functions between temperature and VMT and PTT, respectively. Using these data and our estimates, we examine the potential cumulative increases in VMT and PTT for each future month out to 2100, as well as the by-season and by-location heterogeneity in these projections (see *Methods: Projection of future impacts*). For each analysis, we examine projections across three emissions scenarios, RCP2.6, RCP4.5, and RCP8.5 (see `r fig_num("proj.plots", display="cite")`d). Importantly, our projections do not incorporate the potential for dynamic feedback between transportation emissions and warming nor do they incorporate projections of increased transport demand due to non-climatic factors. Our projections may thus represent a conservative lower-bound. 

```{r vmt_cumul}
state_projection <- readRDS("./state_projection_bootstrap.rds")

# calculate and plot by-month cumulative projection for rcp26/45/85----
## need to use 2006-2020 tmax and calculate differences by state then cumulate
mean2006_2020 <- state_projection[year(monthyr) >= 2006 & year(monthyr) < 2020, 
                                  .(base_rcp26_coef = mean(tmax_rcp26_coef),
                                    base_rcp45_coef = mean(tmax_rcp45_coef),
                                    base_rcp85_coef = mean(tmax_rcp85_coef)), 
                                  by=.(month(monthyr), state)]
state_projection[, month := month(monthyr)]
setkey(state_projection, state, month)
setkey(mean2006_2020, state, month)
state_projection <- merge(state_projection, mean2006_2020)
rm(mean2006_2020)

## get baseline vmt (millions) by state for 2013-2017 (most recent five-year period)
baseline_vmt <- vmt[year(monthyr) >= 2013 & year(monthyr) <= 2017, .(vmt = mean(vmt, na.rm=T)), by=.(state, month(monthyr))]
setkey(baseline_vmt, state, month)
setkey(state_projection, state, month)
state_projection <- merge(state_projection, baseline_vmt)
rm(baseline_vmt)

## calculate changes relative to baseline
## exponentiate coefficients to get percentage change, convert to vmt, difference from fixed baseline
state_projection[, ':=' (delta_rcp26 = exp(tmax_rcp26_coef)*vmt - exp(base_rcp26_coef)*vmt,
                         delta_rcp45 = exp(tmax_rcp45_coef)*vmt - exp(base_rcp45_coef)*vmt,
                         delta_rcp85 = exp(tmax_rcp85_coef)*vmt - exp(base_rcp85_coef)*vmt)] 

## keep only year 2020+
state_projection <- state_projection[year(monthyr) >= 2020]

## cumulate changes by state across month-years
setkey(state_projection, state, monthyr)
state_cumul_changes <- state_projection[, .(cumul_rcp26 = cumsum(delta_rcp26),
                                            cumul_rcp45 = cumsum(delta_rcp45),
                                            cumul_rcp85 = cumsum(delta_rcp85),
                                            monthyr = unique(monthyr)), by=.(state)]

## cumulate changes by full us
us_cumul_changes <- state_cumul_changes[, .(cumul_rcp26 = sum(cumul_rcp26),
                                            cumul_rcp45 = sum(cumul_rcp45),
                                            cumul_rcp85 = sum(cumul_rcp85)), 
                                        by=.(monthyr)]

dum <- data.frame(point = sort(rep(letters[1:3], 10)), x = us_cumul_changes$monthyr, y = runif(30))
us_cumul_plot <- ggplot(us_cumul_changes, aes(x=monthyr)) + 
                  geom_hline(aes(yintercept=0), linetype=2, color='gray80') +
                  geom_line(aes(y=cumul_rcp26/1000), color='#d5ed76', size=3) + 
                  geom_line(aes(y=cumul_rcp45/1000), color='gray60', size=3) + 
                  geom_line(aes(y=cumul_rcp85/1000), color='#FB8C00', size=3) +
                  ylab("Cumulative Added VMT in Billions of Miles") +
                  xlab("Calendar Month") +
                  geom_line(data = dum, aes(color = point, x = x, y = y), alpha = 0, size=4) + 
                  scale_color_manual(guide = guide_legend(override.aes = list(alpha = 1)),
                     limits=c("#d5ed76","gray60","#FB8C00"), 
                     values=c("#d5ed76","gray60","#FB8C00"),
                     labels=c("RCP2.6","RCP4.5","RCP8.5")) +
                  coord_cartesian(ylim=c(0,1000)) +
                  scale_y_continuous(breaks=c(0, 250, 500, 750, 1000)) +
                  theme(plot.title = element_text(face="bold", size=40),
                        axis.title.x = element_text(size=20, face="bold"),
                        axis.title.y = element_text(size=25, face="bold"),
                        axis.text.y = element_text(size=12, face="bold"),
                        axis.text.x = element_text(size=12, face="bold"),
                        panel.grid.major = element_blank(),
                        panel.grid.minor = element_blank(),
                        panel.background = element_blank(),
                        axis.line.x = element_line(),
                        axis.line.y = element_line(),
                        axis.ticks.y = element_line(),
                        legend.title=element_blank(),
                        legend.position = c(.2, 0.7),
                        legend.text=element_text(size=17, face="bold"),
                        legend.key = element_rect(colour = "black", fill="white")
                  )
```

```{r}
proj.plots <- fig_num(name = "proj.plots", caption = "")
```

```{r fig.projection.plots, out.width="100%", fig.cap=paste0(fig_num("proj.plots", display="cite"),". \\textbf{Climate change may amplify future vehicle miles traveled.} Panel a plots the by-state log of projected cumulative added VMT by 2050 and 2099 across emissions scenarios. Panel b displays the time-series of projected cumulative added VMT due to future warming over the entire US. Under RCP8.5, future US warming may produce over one trillion cumulative added VMT by 2100. Panel c plots historical maximum monthly temperatures from our sample and the peak of the relationship between maximum temperatures and log(VMT). Approximately 88\\% of historical temperatures fall below this peak. Panel d plots monthly temperature anomaly projections relative to the 2006-2020 baseline for each state from 2020 to 2100. Panels e-f plot sumtotal added VMT during each month of the year for 2040-2050 and 2090-2100.")}
knitr::include_graphics("./figure3.pdf")
``` 

Our resulting projections indicate that -- under higher rates of emissions (RCP8.5) -- future anthropogenic warming could produce a cumulative addition of approximately `r round(max(us_cumul_changes$cumul_rcp85)/1000000, 1)` trillion VMT in the United States alone by 2100 (`r fig_num("proj.plots", display="cite")`b). This represents `r plyr::round_any((max(us_cumul_changes$cumul_rcp85)/1000000)/(sum(vmt[vmt$year==2017,]$vmt*1000000, na.rm=T)/1000000000000)*100, 5)`% of the vehicle miles traveled in the most recent year in the VMT data. `r fig_num("proj.plots", display="cite")`a depicts that states with higher present amounts of VMT observe the largest absolute increases in the future. Further, because the peak of our historical function for VMT is towards the upper end of the historical temperature distribution (`r fig_num("proj.plots", display="cite")`c), the majority of the months of future years are associated with increases in VMT out to both 2040-2050 (`r fig_num("proj.plots", display="cite")`e) and 2090-2100 (`r fig_num("proj.plots", display="cite")`f). `r fig_num("proj.maps", display="cite")` depicts the spatial and temporal heterogeneity of our VMT projections for RCP8.5 in percent relative to baseline VMT (see *Methods: Projection of future impacts* and Fig. S3 and S4 for these projections across RCP2.6 and RCP4.5). We project future cooler months of the year to observe the largest increases in VMT, while we project warmer months to observe smaller in magnitude decreases in VMT.

Similarly, we project that future warming will also increase demand for public transit. Our projection indicates a cumulative increase of nearly six billion added PTT in 2100 under RCP8.5, with those states having high current rates of public transit seeing the largest increases (see Fig. S5). This figure represents over `r plyr::round_any((6000000000)/(sum(upt[upt$year==2017,]$trips, na.rm=T))*100, 5)`% of the PTT that occurred in the most recent year in the historical data. Further, our projections indicate a net increase in public transit demand across nearly all months of the year over the coming century. We depict our projections broken out by states and future months across emissions scenarios in Fig. S6-S8.

Climatic conditions alter myriad human behaviors, and recent data indicate that transportation behaviors are no exception. We find a robust causal relationship between temperature and the use of transportation in the US. Warmer temperatures, up to approximately 30°C, amplify both vehicle miles traveled and public transportation trips along with the transport emissions these activities produce. These results are present in a variety of contexts and persist across numerous robustness checks. Our projections employing climate model output suggest that warming over this century may amplify use of transportation in the US, especially during cooler seasons and in regions with already high levels of transportation usage. Importantly, this temperature-behavior dynamic poses the risk of a novel feedback loop in the human-environmental system. These conclusions, however, are subject to a number of considerations.

First, transportation is currently the leading sector for US greenhouse gas emissions [@unger2010attribution] and its relative contributions are likely to continue to grow [@nstprsc2008transportation]. Unless and until transport is decarbonized [@williams2011technology;@needell2016potential;@davis2018net] -- a process that poses notable challenges [@creutzig2015transport] -- any cumulative increases in VMT and PTT will translate directly into amplifications of cumulative anthropogenic greenhouse gas emissions. Because of the uncertainties associated with future transportation energy sources and resultant emissions, coupled with the difficulties of dynamically incorporating these feedbacks into the climate system, here we refrain from quantitative projection of likely emissions due to added future VMT and PTT. Future studies should seek to quantify the emission risk such a dynamic feedback loop might pose.

Second, our point estimates of future VMT and PTT due to feedback are relatively small in magnitude. Each future cumulative projection represents less than one year of current VMT (35%) and PTT (60%), respectively. These projections are likely conservative, however. For one, our empirical estimates of the relationship between temperature and transportation demand are likely downwards biased due to a combination of measurement error [@wooldridge2010econometric] and amplification of noise due to the deconvolution process we employ in our analyses [@blanc2017use]. In order to identify the causal relationship between temperature and transport, our procedure necessarily omits from its estimation any potential climate-driven seasonal effects of transportation [@hsiang2016econometrics]. Further, in our projections we pin VMT and PTT demand to its baseline average over the five year, 2013-2017 period. Transportation usage is likely to increase due to both population pressures and economic growth over the coming decades [@nstprsc2008transportation], amplifying any climate change related impacts. Moreover, our projections omit dynamic feedback-driven increases in transport due the anthropogenic forcing from instantaneous increases in emissions due to added transport usage. The feedback is, in reality, exponential in nature, though we model it in a static manner here.

Third -- even though we demonstrate increases in baseline mobility across the two primary modes of human transport in the US -- we are unable to definitively identify all variables mediating our results [@hsiang2016econometrics]. We hypothesize that the added transportation demand is due to an increased baseline demand for mobility (as evidenced by similar increases in physical activity rates [@obradovich2017climate]). However, precisely why demand for mobility increases in response to warmer temperatures -- as has now been observed in numerous locations around the world -- is an important question needing further investigation. Perhaps it is that individuals are spending less of their time resting or sleeping [@zivin2014temperature;@obradovich2017nighttime;@obradovich2018sleep], and therefore have more time to spend in activities require mobility. Or perhaps mobility behaviors are undertaken as a means of offsetting some of the mental health consequences of hotter temperatures [@obradovich2018empirical;@burke2018suicide].  Ultimately, future studies are needed to disentangle the root causes of this relationship.

Fourth, the effects we observe in our data may not hold in other countries or into the distant future. Extending this analysis to other countries where transportation demand is more rapidly growing -- such as China and India -- will be useful for examining the extent to which our findings here represent risk of a global feedback loop. Further, transportation in the future may look markedly different from its current form [@bonnefon2016social] a fact that adds to the uncertainty of our projections. Consideration of the effects we identify here in the context of shared socioeconomic pathways may be useful [@o2014new]. Additionally, humans may adapt to warmer climates in ways that mitigate or amplify future effects [@barreca2016adapting;@gosling2017adaptation]. We observe no empirical evidence of historical adaptation in our VMT series but cannot rule out the potential for future adaptation.

Fifth, while transportation represents the largest emissions sector in the US, it is only one of myriad potential areas where we may observe climate-behavior feedback loops. While feedback loops in the transportation sector may be of relatively low magnitude when taken alone, they must be considered within the full scope of potential climate-behavior feedback loops. For example, the effect of warming on net carbon emissions from the heating and cooling sectors [@auffhammer2014measuring;@davis2015contribution;@auffhammer2017climate] as well as any mitigation or adaptation actions intentionally taken in response to observed climatic changes [@beckage2018linking] represent other areas of important potential feedback worth additional study. Further quantification of the scope, magnitude, and uncertainties of these potential climate-behavior feedback mechanisms -- with attempts to incorporate them into projections of future warming -- is critical [@beckage2018linking].

Climate scholars investigate human contributions to climate change and climate change's role in shaping human systems separately. Yet these dynamics are intricately intertwined. Climate alters human emitting behaviors that contribute heavily to climate change. Current climate models incorporate anthropogenic inputs in a non-dynamic, linearly additive manner. The feedback we document here is dynamic and exponential in nature. By omitting these dynamics, models may be markedly underestimating anthropogenic forcing of the climate system, in turn producing lower-bound estimates of the future magnitude of the climate crisis.

```{r}
proj.maps <- fig_num(name = "proj.maps", caption = "")
```

```{r fig.projection.maps, out.width="100%", fig.cap=paste0(fig_num("proj.maps", display="cite"),". \\textbf{Effects of warming on vehicle miles traveled vary by state across future months.} This figure depicts the projected impacts of future warming on mean percentage change in VMT for the years between 2040 and 2050 and between 2090 and 2100 for the RCP8.5 scenario. Cooler months of the year observe the largest increase in VMT over these periods, while summer months observe decreases.")}
knitr::include_graphics("./figure4.pdf")
```

# Acknowledgements
We thank Daniel Alpert for his excellent research assistance and Patrick Baylis for sharing code. We also thank Patrick Baylis, Valentina Bosetti, Emilio Ferrara, James H. Fowler, Flavio Lehner, Kristina Lerman, Robyn Migliorini, Nicholas Millar, Frances Moore, David G. Victor, and seminar participants at USC's Information Sciences Institute for their helpful comments.

# Author contributions
N.O. constructed and analyzed the data, produced figures and tables, drafted the manuscript, and conceived of the research question. All authors refined the concept and edited the manuscript.

# Data availability

Replication data and materials are stored in Harvard's Dataverse at https://doi.org/10.7910/DVN/2LIINY.

# Funding statement

Support for this work was provided by the MIT Media Lab.

# Methods

## Data

**Outcome measures: ** We gather our VMT data from the US Federal Highway Administration, Office of Highway Policy Information's Traffic Volume Trends product. These monthly data are derived from approximately 4,000 continuous traffic counting stations across the US. Our unit of analysis for these data is the state-by-calendar month. We collect our PTT data from the US Federal Transit Administration's National Transit Database. These data are collected monthly from hundreds of metropolitan transit agencies around the country, and our unit of analysis is city (metropolitan area)-by-calendar month. Our data represent the standard data used by the US FHA and FTA in assessing traffic and public transit trends.

**Climatic data: ** We employ gridded (at ~4km) temperature and precipitation from the PRISM Climate Group [@di2008constructing] and average cloudcover, relative humidity, and wind speed from the National Centers for Environmental Prediction Reanalysis II project [@kanamitsu2002ncep]. Global circulation model projections employ NASA Earth Exchange's DCP-30 monthly maximum temperature ensemble average projections from 2006 to 2099 [@thrasher2012technical] across 21 of the CMIP5 models [@taylor2012overview] run on the RCP2.6, RCP4.5, and RCP8.5 emissions scenarios [@meinshausen2011rcp]. We average our climatic data to the state (for VMT) and metropolitan area (for PTT) boundaries of our sample. All shapefiles are drawn from the US Census Bureau's TIGER products.

## Estimation of historical impacts

**Empirical model:** Our relationships of interest are the marginal effects of monthly maximum temperatures on the logged values of VMT and PTT, respectively. We model this for VMT as:

```{r}
reg.eq <- eq_num(name = "reg.eq", caption = "")
```

$$\begin{aligned}
Y_{imy} = f(TMAX_{imy}) + \textbf{Z}\boldsymbol{\eta} + \alpha_{im} + \mu_{iy} + \epsilon_{imy} \ \ \ \ \ (1) \end{aligned}$$

In this cross-sectional, time-series model, $i$ indexes states, $m$ months of the year, and $y$ calendar years. Our dependent variable $Y_{imy}$ is continuous and represents the natural log of VMT in state $i$ in month $m$ within calendar year $y$. $TMAX_{imy}$ represents the monthly average of daily maximum temperatures. For PTT, $i$ indexes cities and $Y_{imy}$ represents the natural log of PTT; indexes are otherwise consistent across outcomes.

To allow flexible estimation of the marginal effects of temperature on our outcome measures, we employ a cubic spline approach, represented by $f()$, with splines set at the quartiles of the distribution of experienced maximum temperatures [@hsiang2016econometrics]. Further, we control for number of monthly days with measurable precipitation and monthly average diurnal temperature range, percentage cloud cover, relative humidity, and wind speed, represented via $\textbf{Z}\boldsymbol{\eta}$ as failure to do so may bias our primary estimates [@hsiang2016econometrics].

To ensure that time-invariant, state-specific (city-specific) factors or seasonal, state-specific (city-specific) factors do not bias our estimates of the effect of temperature on transportation outcomes, we include $\alpha_{im}$ -- representing non-parametric state-by-month (city-by-month) fixed effects -- in `r eq_num("reg.eq", display="cite")` [@hsiang2016econometrics]. Further, there may be state-specific (city-specific) time trends influencing our transportation outcomes that could spuriously correlate with meteorological conditions. To control for these potential state (city) specific trending confounds we include $\mu_{iy}$ in `r eq_num("reg.eq", display="cite")`, representing state-by-year (city-by-year) fixed effects.

After deconvolving our data around unit-by-month and unit-by-year fixed effects, the remaining variation in our meteorological variables is treated as good as randomly assigned to the remaining variation in our outcome measures [@hsiang2016econometrics;@carleton2016social]. The marginal effects of $f(TMAX_{imy})$ from this estimation can thus be interpreted as the causal effects of particular maximum temperatures on our transportation outcome measures [@hsiang2016econometrics;@carleton2016social].

We account for serial correlation in $\epsilon_{imy}$ by conducting 1,000 cluster bootstrapped splined regressions clustered on state for VMT and on city for PTT [@hsiang2016econometrics]. Further, we omit non-climatic control variables from `r eq_num("reg.eq", display="cite")` because they may generate post-treatment bias in our parameters of interest [@hsiang2013quantifying;@hsiang2016econometrics]. To estimate our models we use the *felm()* function from the *lfe* package in **R** [@gaure2013ols].

**Deconvolved data: ** To deconvolve our our temperature and VMT/PTT outcome measures to plot them as anomalies in `r fig_num("main", display="cite")` panels (c-h), we employ the same set of fixed effects as in `r eq_num("reg.eq", display="cite")` and use the *demeanlist()* function from the *lfe* package in **R** [@gaure2013ols;@hsiang2016econometrics].

**Robustness of estimation of historical impacts:** Our results are robust to varying the specification of maximum temperatures in our model (see Fig. S2) across both the VMT and PTT models. In each of these robustness checks, we substitute a separate way of parameterizing the estimation of the temperature functional form. We adjust for serial correlation in $\epsilon_{imy}$ in these robustness checks by employing heteroskedasticity-robust standard errors clustered on state for VMT and on city for PTT [@hsiang2016econometrics]. In our first robustness check (Fig. S2a and Fig. S2e), we employ the full distribution of daily maximum temperatures over the period by pulling in counts of the number of monthly days within each 5&#x2103; daily temperature bin [@deschenes2011climate]. We omit the 30&#x2103;-35&#x2103; maximum temperature bin  and interpret our estimates as the change relative to this baseline. Coefficient estimates in this model represent the effect on our logged outcome measures of having one additional day out of the month falling in the specified maximum temperature bin. In our second check (Fig. S2b and Fig. S2f), we employ separate indicator variables for each 5&#x2103; monthly average maximum temperature bin and again interpret these estimates relative to the 30&#x2103;-35&#x2103; omitted bin [@obradovich2017climate]. Coefficient estimates in this model represent the effect of being in one monthly maximum temperature bin on our logged outcome measures, relative to the omitted category of temperatures. Finally, we parametrically model monthly maximum temperature with a quadratic fit in Fig. S2c and Fig. S2g. We plot the marginal effects of these quadratic terms in Fig. S2d and Fig. S2h. They are highly significant across the cooler portions of the distribution of temperature, losing significance at the hottest portion of the distribution. Importantly, across each of these additional ways of specifying our functional form for maximum temperatures, the functional form recovered closely mirrors that recovered from our spline regression in the main text.

## Projection of future impacts

We produce monthly spatial averages for each US state for each month between 2006 and 2100 employing NASA's NEX DCP-30 ensemble average projection data across each of RCP2.6, RCP4.5, and RCP8.5. This gives us a state-by-month climate projection for every month out to 2100 across these three emissions scenarios.

To calculate our projections of added VMT depicted in `r fig_num("proj.plots", display="cite")` and `r fig_num("proj.maps", display="cite")`, we first fit the predicted marginal effects estimated from the splined fit in `r eq_num("reg.eq", display="cite")`, $\hat{f}()$, for each future month across each emissions scenario:

```{r}
fit.eq <- eq_num(name = "fit.eq", caption = "")
```

$$\hat{Y}_{imyr} = \hat{f}(TMAX_{imyr})\ \ \ \ \ (2)$$

Where $i$ again indexes states, $m$ indexes months, $y$ indexes years, and $r$ indexes the three emissions scenarios. With these log-scale fitted values, we then calculate our projected percentage change in vehicle miles traveled due to climate change for each future state-month-scenario. To do so, we employ the months of years 2006-2020 as our $TMAX$ baseline and calculate:

```{r}
pct.change.eq <- eq_num(name = "pct.change.eq", caption = "")
```

$$\hat{\Delta Y}_{imyr} = e^{\hat{f}(TMAX_{imyr})} - e^{\hat{f}(\overline{TMAX}_{2006-2020,\ imr})}\ \ \ \ \ (3)$$

Here, exponentiating our natural log-scale coefficients translates them into percentage terms. The difference of the two terms provides the projected percentage change in $\hat{\Delta Y}_{imyr}$. We plot the mean of this projected percentage change by-state over the periods 2040-2050 and 2090-2100 in `r fig_num("proj.maps", display="cite")`. Then, to calculate the change in absolute number of VMT, we multiply these percentage changes by the average state-month number of VMT over 2013-2017, the most recent complete five-year period in our data: 

```{r}
vmt.change.eq <- eq_num(name = "vmt.change.eq", caption = "")
```

$$\hat{\Delta VMT}_{imyr} = \hat{\Delta Y}_{imyr} * \overline{VMT}_{2013-2017,\ im}\ \ \ \ \ (4)$$

This procedure gives us a projected change in number of VMT across each state, future calendar month, and emissions scenario for years 2020-2100.

**Monthly added VMT:** To calculate the by-month additions to VMT across the full US from future warming we present in `r fig_num("proj.plots", display="cite")`d-e, we employ the output from `r eq_num("vmt.change.eq", display="cite")` and sum this output across each state across each ten-year future period (2040-2050 and 2090-2100) for each month and each emissions scenario:

```{r}
monthly.change.eq1 <- eq_num(name = "monthly.change.eq1", caption = "")
```

$$\begin{aligned}
\hat{\Delta VMT}_{2040-2050,\ mr} &= \sum_{y=2040}^{2050}\hat{\Delta VMT}_{myr}\\
\hat{\Delta VMT}_{2090-2100,\ mr} &= \sum_{y=2090}^{2100}\hat{\Delta VMT}_{myr}\ \ \ \ \ (5)
\end{aligned}$$

**Cumulative added VMT:** To calculate the cumulative US time-series of added VMT due to future warming presented in `r fig_num("proj.plots", display="cite")`a, we calculate the cumulative sum of our projections across states and across months for each future year and each emissions scenario:

```{r}
cumul.change.eq <- eq_num(name = "cumul.change.eq", caption = "")
```

$$\begin{aligned}
Cumul.\ \hat{\Delta VMT}_{yr} &= \sum_{k=2020}^{y}\hat{\Delta VMT}_{kr}\ \ \ \ \ (6)
\end{aligned}$$

**State-year cumulative added VMT:** Finally, to calculate the cumulative sum of added VMT due to future warming across states in a given future year, presented in `r fig_num("proj.plots", display="cite")`a, we modify `r eq_num("cumul.change.eq", display="cite")` to include a by-state index and select the specific years of 2050 and 2099: 

```{r}
state.cumul.change.eq <- eq_num(name = "state.cumul.change.eq", caption = "")
```

$$\begin{aligned}
Cumul.\ \hat{\Delta VMT}_{iyr} &= \sum_{k=2020}^{2050}\hat{\Delta VMT}_{ikr}\\
Cumul.\ \hat{\Delta VMT}_{iyr} &= \sum_{k=2020}^{2099}\hat{\Delta VMT}_{ikr} \ \ \ \ \ (7)
\end{aligned}$$

**Projection of PTT:** To estimate the possible impact of future climatic changes on PTT, we employ our empirically estimated function of the relationship between historical temperatures and historical PTT and couple this estimate with our NEX climate model projection data. Projections are conducted at the metropolitan area level and aggregated up to states for geographic visualization. We otherwise follow the projection procedure outlined above for VMT for the PTT projections.

# References
